AUGUST 1990 

COMPUTER 

































JUST WHEN YOU THOUGHT IT WAS SAFE TO SIT
DOWN AT YOUR PC

SERIOUS ADA FOR DOS.

Almost everyone who programs advanced military systems uses Ada. But until now, you
could only use serious Ada and Ada development tools on high-powered workstations and
larger computer systems. Now, TeleSoft offers the complete set of Ada productivity tools,
including our TeleGen2 Ada Compiler, Source Level Debugger, Library Management, Global
Optimizer, Ada Profiler, Ada Language Tools for PC DOS, 386 UNIX and MAC II computers.



TeleSoft’s total service and support program, you can count on significantly increasing your
productivity. For complete details on serious Ada tools for DOS, call TeleSoft today at
(619) 457-2700. And let us show you why they’re the right tools for your PCs.



McDonnell Douglas and General 
Dynamics are using TeleGen2 Ada 
Development Systems targeting 
embedded 1750A processors to 
help upgrade existing fighter aircraft. 
They are also using TeleGen2 Ada to
create real-time software for advanced
technology avionics programs.



Programmed for Productivity 


Reader Service Number 1 











New for network analysts and designers 

Free trial and, if you act now, free training 

LANNET II.5 COMNET II.5 

Local Area Networks Wide Area Networks 




LANNET II.5 uses simulation to predict
your LAN performance. You simply 
describe your LAN and workload. 

Animated simulation follows immediately 
-no programming. 

Easy-to-understand results 

You get an animated picture of your LAN. 
System bottlenecks and changing levels of 
utilization are apparent. 

Your reports show LAN statistics such as 
transfer times, delays, and queues. Client, server, 
and gateway statistics show queue lengths, 
waiting times, and messages sent. 

Your LAN simulated 

You can predict the performance of any LAN. 
Industry standard protocols such as Ethernet,
Token Ring, and Token Bus are built-in. Variations
can be modeled.


COMNET II.5 uses simulation to predict
your network performance. You simply 
describe your network, traffic load, and routing 
algorithms. 

Animated simulation follows immediately 
—no programming. 

Easy-to-understand results 

You get an animated picture of your network. 
Routing choices and changing levels of network 
utilization are apparent. 

Your reports show response times, blocking 
probabilities, call queueing and packet delays, 
network throughput, circuit group utilization, 
and circuit group queue statistics. 

Your network simulated 

You can include LANs and multidrop lines in
your model. X.25, SNA, DECNET, ISDN, fast 
packet, TCP/IP, token passing, and CSMA/CD 
are easily modeled. 


Free Trial Offer 

The free trial contains everything you need to 
try LANNET II.5™ or COMNET II.5™ on 
your PC, Workstation, or Mainframe. Act now 
for free training-no cost, no obligation. 

For immediate information 

For LANNET II.5 call Cliff Baker or for 
COMNET II.5 call Chris LeBaron at (619) 
457-9681, Fax (619) 457-1184. In Europe, call 
Nigel McNamara, in the UK, on (081) 332-0122, 
Fax (081) 332-0112. Worldwide, call Joe Lenz, in 
the US, (619) 457-9681, Fax (619) 457-1184. 

University faculty should call about our 
special offer for research and teaching. 


Rush free trial & training information for: 

□ LANNET II.5 □ COMNET II.5 






















COMPUTER 


August 1990 Published by the IEEE Computer Society Vol. 23, No. 8 


ARTICLES 


8 Guest Editor’s Introduction: Voice in Computing

Ragui Kamel

10 Voice in Computing: An Overview of Available Technologies

Carl R. Strathmeyer

Barriers between voice and computing are falling. As practitioners come to understand the technologies at the intersection of these
disciplines, new applications are sure to appear.

17 Text-to-Speech Conversion Technology

Michael H. O’Malley

Reading an English text out loud is currently the most successful simulation by a computer of a complex human mental function.
This article shows how it can be done.

26 An Introduction to Speech and Speaker Recognition

Richard D. Peacocke and Daryl H. Graf

Speech recognition, the ability to identify spoken words, and speaker recognition, the ability to identify who is saying them, are
becoming commonplace applications of speech processing technology.

35 Putting Speech Recognition to Work in the Telephone Network

Matthew Lennig

The early success of an automated call-handling system using interactive voice technologies foreshadows huge savings for telephone
companies and a wealth of new services for consumers.

43 Anser: An Application of Speech Technology to the Japanese Banking Industry

Ryohei Nakatsu

Anser combines speech recognition and synthesis to offer telephone banking services to millions of customers. New technology will
soon make the system cheaper and expand its uses.

50 Augmenting a Window System with Speech Input

Chris Schmandt, Mark S. Ackerman, and Debby Hindus

With Xspeak, window navigation tasks usually performed with a mouse can be controlled by voice. A new version, Xspeak II,
incorporates a language for translating spoken commands.

59 Talk and Draw: Bundling Speech and Graphics

Mark W. Salisbury, Joseph H. Hendrickson, Terence L. Lammers, Caroline Fu, and Scott A. Moody

Responding to simultaneous spoken and graphical input, a computer interface in development for the AWACS defense system
increases operator effectiveness in directing military resources.

66 Extending the Notion of a Window System to Audio

Lester F. Ludwig, Natalio Pincever, and Michael Cohen

Just as visual window systems let multiple applications share display resources, an audio window system could bring order to the
cacophony of multiple simultaneous audio sources.

73 PX: Supporting Voice in Workstations

Ragui Kamel, Kamyar Emami, and Robert Eckert

The Personal Exchange (PX) research project explores an architecture to provide personal workstation users with dexterity in
manipulating voice. This article describes PX concepts and an initial implementation of the architecture.


Circulation: Computer (ISSN 0018-9162) is published monthly by the IEEE Computer Society, 10662 Los Vaqueros Circle, PO Box 3014, Los Alamitos, CA 90720-1264;
phone (714) 821-8380. IEEE Computer Society Headquarters, 1730 Massachusetts Ave. NW, Washington, DC 20036-1903; IEEE Headquarters, 345 East 47th St.,
New York, NY 10017. Annual subscription included in society member dues. Nonmember subscription rates: available upon request. Single copy prices: members $10.00; 
nonmembers $20.00. This magazine is also available in microfiche form. 

Postmaster: Send undelivered copies and address changes to Computer, IEEE Service Center, 445 Hoes Lane, Piscataway, NJ 08855. Second class postage is paid at New 
York, New York, and at additional mailing offices. 

Copyright and reprint permission: Copyright © 1990 by the Institute of Electrical and Electronics Engineers, Inc. All rights reserved. Abstracting is permitted with 
credit to the source. Libraries are permitted to photocopy beyond the limits of US copyright law for private use of patrons: (1) those post-1977 articles that carry a code at 
the bottom of the first page, provided the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 29 Congress Street, Salem, MA 01970; (2) pre-1978
articles without fee. Instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. For other copying, reprint, or republication
permission, write to Permissions Editor, Computer, 10662 Los Vaqueros Circle, PO Box 3014, Los Alamitos, CA 90720-1264. 

Editorial: Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author’s or firm’s opinion. Inclusion in
Computer does not necessarily constitute endorsement by the IEEE or the Computer Society. All submissions are subject to editing for style and clarity.















6 President’s Message 

Meeting the technology challenge 


ciety News 

recognizes special achievements and service; 
shops, roundtable among activities of VLSI TC; 
red by Computerworld/Smithsonian Awards;
it in Society’s awards program. 


Cover image: Alexander Torres 

Cover design: Jay Simpson, Design & Direction


98 IC/Microsystem Announcements 

103 Product Reviews 

Image processing on the Macintosh 

112 Conferences 

Testing computer software; multiple-valued logic; artificial intelligence 

114 Call for Papers/Calendar 
124 Book Reviews/CS Magazines 

Career Opportunities, 110; Computer Society Information, 128; 
Membership Application, 123; Change-of-Address Form, 33; 
Advertiser/Product Index, 96; Reader Service Card, 96A 


In the next issue 

A Specifier’s Introduction to Formal Methods; 
Tango: A Framework and System for Algorithm 
Animation; Project Athena as a Distributed 
Computer System; Multimedia Conferencing on 
Local Area Networks; Multiprocessor 
Performance Measurement Instrumentation 






















©1990 Sun Microsystems, Inc. ®Sun Microsystems and the Sun logo are registered trademarks of Sun Microsystems, Inc. OPEN LOOK is a trademark of AT&T. All other products or services


THE BEST USER INTERFACE
IN TOWN. NOW PLAYING AT A
SCREEN NEAR YOU.


The OPEN LOOK™ user interface.

It's a real hit with independent software vendors, in-house developers and end
users. In fact, over 300 applications are in development today. By people like
Lotus,® INFORMIX,® Island Graphics,® Interleaf,® and Frame. And it's the most
popular front end to UNIX. For a number of reasons.

First of all, it makes UNIX easy to use. Because there are no complicated UNIX
commands. It also looks better than any other interface. From its icons to its 3D
elements. And makes users more efficient. For example, our drag and drop
feature gives them a simple, intuitive way to move files around the desktop.
Our push-pin icon makes it even easier to use. And OPEN LOOK gives users
the same interface across multiple platforms, so they learn it once. And enjoy
access to a huge range of network resources.

As a developer, you'll see it's also the easiest to work with. Because it's part of
OpenWindows,™ a complete development environment. With the tools you
need to create applications faster than ever. And ready-made features, like our
DeskSet™ graphical productivity tools, that you can give users right away.

Of course, the business reasons to choose OPEN LOOK are just as strong.
OPEN LOOK is the standard interface of AT&T's UNIX System V.4, so it's
included at no charge. And it will run on over 20 platforms, including DEC, HP,®
and IBM. Since it's portable across multiple platforms, you only write your
application once. Which saves thousands of man-hours. Finally, with OPEN
LOOK, you have the full support of a company that leads the workstation
industry in worldwide shipments.*

We've put together a videotape that shows you exactly what OPEN LOOK is
all about. Just call us at 1-800-624-8999 (ext. 2068), and we'll send you a
free copy.

Then find a nice comfortable seat close to your screen. Because the closer
you look, the better we get.



mentioned are identified by the trademarks or registered trademarks of their respective companies or organizations. *Source: International Data Corporation, 1990. 36.3% market share.

Reader Service Number 2 





Computer Society President’s Message


Meeting the technology challenge 


Francisco Sizzi, a 17th-century professor of astronomy, reportedly said,
“Jupiter’s moons are invisible to the naked eye and therefore can
have no influence on the earth, and therefore would be useless, and therefore
do not exist.” Clearly, we cannot afford this attitude where computer-based
systems are concerned.

Each day computer and telecommunications systems touch some aspect of
our lives: transportation, banking, communications, health care, weather
prediction, and so forth. Governments, manufacturers, and many others rely on
information technology to streamline operations, improve service, and handle
the ever-accelerating rate of change.


At the same time, the complexity of systems (both hardware and
software) is increasing. Difficulties encountered in developing these
systems lead to delays in implementation, compromises in capability, and
uncertainties about reliability.

A 1989 staff study submitted to the Committee on Science, Space, and
Technology of the US House of Representatives (“Bugs in the Program:
Problems in Federal Government Computer Software Development and
Regulation”) observed that computers may pose a threat to public
health and safety as their use expands in areas such as medical
instrumentation and air traffic control. The report cited an
investigation into deaths from radiation overdoses during medical therapy
that traced the problem to errors in the computer programs controlling the
system. Earlier this year, a software “glitch” degraded telephone service
for many US customers. The number of incidents reported grows almost daily.

Attempting to characterize the software part of the problem, the Computer
Science and Technology Board of the US National Research Council held a
workshop on complex software systems research needs. Position statements in
the workshop report, “Scaling Up: A Research Agenda for Software
Engineering,” stated:


There are few human endeavors that are as difficult to grasp as a complex
program... The relations, processes, and purposes of the elements are
difficult to describe and thus difficult to use as construction elements.
Creating tools, methods, or magic to solve these difficulties is extremely
hard. Today it is a truism that we test until out of money and time, not
because we’ve achieved results.

Recommended actions developed during this workshop include

• Portray the software development process more realistically.

• Study and preserve software artifacts.

• Develop unifying models, and strengthen mathematical and scientific
foundations.

• Codify and disseminate software engineering knowledge.

• Nurture collaboration among system developers and between developers
and users.

• Foster practitioner and researcher interaction, and legitimize academic
exploration of large software systems.

• Glean insights from behavioral and managerial science.

• Develop new research paradigms.

Many of these recommendations could easily be expanded beyond software
components to address total systems.

A workshop on computer-based systems engineering, held this May in Israel,
prior to IEEE CompEuro 90, led attendees to observe that an explosion in both
technology and requirements has created new complexities for industry,
academia, and government. Participants concluded that the need has emerged
for a new engineering discipline, one concerned with engineering activities
from conception through maintenance of computer-based systems. (A detailed
report of the conclusions from that workshop is under development.)

Fred Brooks said there is “no silver bullet” for slaying the software monster
(“No Silver Bullet,” Computer, Apr. 1987, pp. 10-19). The same can be said of
complex systems in general. Yet surely there are some basic principles that
apply. A “high level” (that is, nontechnical) set that I find appealing
emerged from a symposium sponsored in late 1989 by the US General Accounting
Office (“Meeting the Government’s Technology Challenge,” GAO/IMTEC-90-23).
These principles strive to provide a framework for successfully integrating
information technology into the organization’s business. The following list
is drawn from that report:

(1) Commitment and vision begin at the top. Without clear direction and
support from the top, modernization programs tend to degenerate into loose
collections of independent systems.

(2) Partnerships can help define the vision. These should take place at all
levels, but certainly the developers and the users of systems must actively
participate on project planning committees. This can help to break down the
artificial barriers between those groups.

(3) Service should be the vision’s cornerstone. Successful use of information
technology requires understanding the needs of the customer (user) and letting
those needs dictate how technology is used.

(4) A clear, flexible architecture should support the vision. That is, there
should be a clear plan for how all major technology elements will fit into the
organization’s overall strategy.

Clearly, improved techniques and disciplines for dealing with
computer-based, complex systems are essential. While there will be much
debate as to the solutions, it is good that the magnitude of the problem is
gaining wider recognition.

What can we as technical professionals do? For a start, we can put a high
priority on improving our own work. Staying informed about developments in
software and systems technology is essential. We should adopt and maintain
businesslike personal disciplines for estimating, planning, and reporting.
And finally, we must try to make computer-based systems more understandable
to management and the general public.

The Computer Society sponsors programs and services that can assist you in
these efforts: tutorials, publications, conferences, workshops, standards
working groups, and technical committees. Where they meet your needs, I hope
you will take advantage of them. Where our programs fall short, please tell us.

Helen M. Wood 
Computer Society president 









Do you practice Yourdon/DeMarco Structured Analysis? 
Do you need to produce high quality DFD’s? 

Do you need a tool that fits your budget? 


Then MacBubbles™ was made for you!


MacBubbles supports the process of structured analysis with: 


Facilities for creating, modifying, leveling and expanding DFD's 
A Data Dictionary that maintains and lists full where-used information 
Automated balancing checks for DFD's and minispecs 


MacBubbles improves productivity: 
Easy to learn 
Easy to use 
Responsive 


MacBubbles is economical: 
Single copy price $779.99 

Multi-copy discount available 
Demo disk $25.00 



MacBubbles requires: 

A Macintosh Plus or SE 

Two floppy drives or a hard disk 

A LaserWriter for high quality output 



Silver Spring, MD 20902-3619 
(301) 946-0522 


Reader Service Number 3 


























Guest Editor’s Introduction 



Voice in 
Computing 


Ragui Kamel, Bell-Northern Research 


Over the past 10 years, the computer has evolved from a tool for
computation to a tool for communication. This includes everything from
writing documents to preparing presentations, and from sharing files across
a LAN to sending electronic mail across an ocean. Despite this new role,
computers have had little integration with the most common
form of human communication: voice.

This has resulted partly because the world continues to be split between
“telecom” and “computing” practitioners. While it is true that computing
technology has played a major part in the evolution of telephone switching,
telecom practitioners have done little to extend voice communication
capabilities to computers. On the other hand, computing practitioners have
done little to exploit the ready availability of voice communications or to
use voice as a computer interface medium. On many people’s desks, the
computer and telephone sit side by side yet isolated from each other.

The purpose of this special issue is to take stock of voice in computing by
assessing the state of the technology, examining typical applications, and
discussing directions in integrating voice with computing. The issue
features a blend of telecom and computing articles. I hope one of the
payoffs will be a shared perspective by telecom and computing practitioners
on the value, technology, and issues of integrating voice communication and
computing. During the refereeing process, I was struck by how often an
article from one field got average reviews from its community but
rave reviews from the other, a clear sign of different perspectives.

I view the integration of voice and 
computing as consisting of three parts: 
telephony control, voice recording and 
playback, and speech processing. 

Telephony control 

Telephony control concerns the use of 
computers to control the features provided 
by a telephone switch. Many gains in call 
management and answering are linked to 
this functionality: 

• A user can search across multiple directories and initiate a call by
clicking on a name. This capability extends to telemarketing applications
where an agent automatically dials through a list of potential clients and
the result of the call (call again, not interested, send brochure ...) is
recorded and processed.

• A user can access a telephony feature, such as call forwarding or setting
up a conference call, by using a graphical/iconic interface from his or her
workstation screen.

• If caller identification is available, a user can obtain database records
related to the call. It is also possible to automatically route the call to
the appropriate person and to time the call for automatic billing purposes
(for example, a lawyer’s office).

There are several advances that provide computers with the capability to
control telephony. Telecom vendors such as AT&T, Northern Telecom, and Mitel
provide a signaling interface to their private branch exchanges (PBXs). Over
this interface, typically a serial link, a computer can send signaling
messages to access the features of the switch. The switch uses the
same interface to provide information on telephony events, such as incoming
calls.

From the computing side, computer manufacturers such as Digital Equipment,
Unisys, and IBM provide access to these signaling interfaces. However, to
isolate applications from the specific switch interfaces and to provide more
abstract call management functions, these vendors offer call-management
application toolkits as libraries in their systems.

The article by Carl Strathmeyer provides a good overview of the state of
computer telephony control with particular emphasis on the state of
standardization of various interfaces, problems related to signaling
control, and the issues of designing computer resource models for providing
these capabilities.


0018-9162/90/0800-0008$01.00 © 1990 IEEE







Voice recording and playback

The most glaring deficiency in most of 
the existing telephony control capabilities 
is their focus on signaling; they do not 
provide the computer with the ability to 
access the voice channel associated with 
making or receiving a call. Having the 
ability to record or play some voice on this 
channel opens new possibilities, such as: 

• Workstation answering machines that 
play voice greetings and record messages. 
Voice mail is a natural extension. 

• Documents that combine voice, data, 
and graphics; for example, using voice as a 
way to provide comments on a document. 

• More sophisticated forms of telemarketing, where not only are numbers
automatically dialed but a prerecorded sales pitch is played and the user
response recorded. Voice response systems in which a user interacts with a
computer over a telephone are a variation on this theme. The typical mode of
interaction is for the computer to prompt with voice and for users to
respond using dual-tone multifrequency (DTMF) phone buttons.

Speech processing 

While prerecorded voice and DTMF 
make for a useful remote access capability, 
they suffer from several shortcomings. 
Prerecorded voice works where the 
amount of voice is small, but becomes 
unmanageable in cases requiring varied 
data, particularly of a textual nature. In the 
other direction, DTMF is certainly not 
ubiquitously available, and the limited 
keyboard of a telephone makes the entry of 
nonnumeric data awkward. 

This situation can be helped with speech-processing technology:
text-to-speech synthesis and speech recognition. This issue contains two
introductory articles on these topics. Michael O’Malley describes the
problems of text to speech and the structure and evolution of a specific
text-to-speech approach. Dick Peacocke and Daryl Graf present a taxonomy
of speech-recognition technology and an assessment of the state of the art.

The next two articles in the issue illustrate how voice recording and
playback and speech processing combine to build useful applications. Matt
Lennig shows how speech has been used to automate some operator functions,
such as collect calls, in the telephone network. Ryohei Nakatsu describes a
voice-response application in the Japanese banking industry.


Speech in user interfaces

Another facet of speech recognition is its use as an input medium in user
interfaces. While speech undoubtedly serves as a valuable input medium for
hands- and eyes-busy situations, it is less clear whether speech input is
helpful as a means of input to workstations.

The article by Chris Schmandt et al. describes a system where voice was
integrated with the X Window System and provides some early feedback on
experience with such an arrangement. One of the key points in this article
is that speech recognition has not matured enough for general use in an
office environment. Good acoustic conditions are required, and even
then recognition errors occur and are difficult to deal with.

One way to improve the quality of recognition is to use domain knowledge in
the speech-recognition interface. The article by Mark Salisbury et al.
describes how this is done in a control system for the Airborne
Warning and Control System (AWACS).

Another issue in the use of voice in user interfaces is how to deal with
multiple voice outputs from different applications. The obvious approach is
to silence all voice that does not pertain to the currently active
window. However, this often causes notable events to go unnoticed.

A different approach is to play all the voice streams but use
speech-processing techniques to highlight the interesting ones. A cocktail
party is a good analogy, where the user focuses on one conversation while
several others occur in the background. The article by Lester Ludwig et al.
describes such an interface.

Personal voice 

Most current voice systems are designed as special-purpose equipment, often
acting as servers. This is natural because voice requires relatively heavy
computing. Recently, personal computer hardware has become powerful enough
to accommodate most telephony control and voice recording and playback
operations.

The trend is for workstations to also provide digital signal processing as a
standard component. This makes for an environment where personal voice
applications can be provided on the desktop. Several research systems have
demonstrated that providing the infrastructure for personal computer voice
stimulates new application ideas but also raises architectural questions.
The architecture and uses of one such personal system are described in the
article by Ragui Kamel et al.


One of the problems with a special issue such as this is breadth versus
depth. Many of the topics the issue encompasses deserve a special issue in
themselves. I decided to provide a fairly broad coverage of the field, at
the expense of in-depth discussion of particular topics. This broad coverage
should provide readers with a good perspective on the technology, value, and
directions of voice in computing. ■


Acknowledgments 

I want to thank the many people who submitted articles or refereed them for
this special issue. I am also indebted to Bruce Shriver for his guidance in
organizing the issue and for obtaining numerous reviews of the articles.
Most of all, I want to thank Susan Heffernan, whose organizing skills and
ability to pleasantly yet firmly encourage people to meet deadlines made
this issue possible.



Ragui Kamel is program director of the Bell-Northern Research Computing
Research Laboratory (CRL). His technical interests include the integration
of voice and computing, high-bandwidth data and video communication, and
the impact of software technologies on communication systems. He has worked
in various areas at BNR, including programming language design, distributed
operating systems, and computing system architecture. Several of his
articles in these areas have been published.

Kamel received his BS in 1975 from the 
University of Manchester in England and his 
MS in 1978 from McGill University in Canada. 
He is a member of the IEEE Computer Society 
and IFIP WG2.4. 

Readers can contact Kamel by mail at Bell-Northern Research, Computing
Research Laboratory, PO Box 3511, Station C, Ottawa, Ontario, Canada,
K1Y 4H7 or by electronic mail at ragui@bnr.ca.








Voice in Computing: 
An Overview of 
Available Technologies 


Carl R. Strathmeyer 
Digital Equipment Corporation 


After decades of separate and parallel development, the disciplines
of voice communication and computing are finally interacting. The
result is a new collection of technologies that suggest an exciting range of
possible applications.

Interaction between voice and computing can take many forms, which for too
long have been lumped under the general heading of “voice/data integration.”
For many readers, voice/data integration simply means that several digital
information streams (some representing voice content and some containing
data) have been multiplexed into a single physical channel. Actually, the
range of available technologies supporting the interaction of voice
and computing is much richer than this.

The absence of cross-fertilizing dialogue between the voice and computing
disciplines has hampered awareness of these technologies, and hence the
discovery of rewarding applications. Though voice/data integration has been
a prominent issue for much of the past decade, few practitioners in either
discipline have a good working knowledge of the available technologies or
their possible applications.

Confusing overlaps and contradictions of language make cross-disciplinary
discussion more difficult than it first appears. For example, the two
disciplines use the term terminal equipment in very different ways. To a
voice practitioner, any subscriber equipment that plugs into the public
network (including a large time-sharing computer system) is a piece of
terminal equipment. To the computing practitioner, the term is much
narrower, indicating a keyboard-oriented input/output device.

This article attempts to bridge these 
differences and foster cross-disciplinary 
dialogue by presenting an overview of the 
technologies useful in building voice/ 
computing applications, along with some 
examples to illustrate how they can be 
used. 


Technology overview 

For convenience, the voice technologies relevant to computing can be grouped
into three general categories that will be explored in more detail later:

(1) Content processing, the manipulation and analysis of the payload of a
voice channel, includes voice channel digitization, digital signal
processing, text-to-speech synthesis, and speech recognition.

(2) Connection control, the setup and manipulation of connections between
voice equipment, includes telephone signaling arrangements
and point-to-point command links.

(3) Software architectures, the organization of computing system software to
facilitate the creation of voice-related applications, includes the abstract
modeling of voice resources and distributed access to voice resources.

Content processing. Content process¬ 
ing is the creation, manipulation, and 
analysis of the information appearing in a 
voice channel. The family of content-pro¬ 
cessing technologies is probably the first 
that comes to mind when we mention the 
topic of voice in computing. Of these, 
digital encoding of voice channels is the 
most familiar. 




8-9162/90/0800-0010$01.00 © 


COMPUTER 












A pulse code modulation primer 

Basic PCM 

The basic linear PCM technique encodes a waveform by defining equally
distributed amplitude values and assigning each value to a binary code. The input
waveform is sampled at regular intervals, and the instantaneous amplitude is
converted to the appropriate code. An n-bit code can represent 2^n different
amplitude values.

Nonlinear PCM 

PCM performance can be improved for typical voice waveforms by assigning
amplitude value points nonlinearly. North American (μ-law) and European
(A-law) telephony standards, for example, use logarithmic assignments in which
codes are assigned more sparsely at the maximum amplitudes and more
densely near the zero-crossing point.

Adaptive PCM 

Adaptive PCM techniques improve performance still further by using a vari¬ 
able scaling factor applied to the amplitude assignments. The range of ampli¬ 
tude values having assigned codes is thus adjusted to the expected range of in¬ 
put amplitudes. Various adaptive techniques implement this scaling adjustment 
on a sample-by-sample, group-of-samples, or longer basis. 

Differential PCM 

For voice waveforms, the range of instantaneous slopes is usually smaller 
than the range of instantaneous amplitudes. Performance can thus be improved 
by encoding the signal's differential rather than the signal itself. The differential 
signal can be developed with either analog techniques (before coding) or digital 
techniques (after coding). 
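
The ideas in this primer can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: the function names are invented for illustration, the μ-law curve is the continuous textbook formula rather than the segmented approximation used by real telephony codecs, and the differential coder stores exact differences instead of quantizing them.

```python
import math

MU = 255  # mu-law compression parameter commonly used with 8-bit codes

def linear_encode(x, bits=8):
    """Linear PCM: map an amplitude in [-1, 1] to one of 2**bits equally spaced codes."""
    levels = 2 ** bits
    x = max(-1.0, min(1.0, x))
    return int((x + 1.0) / 2.0 * (levels - 1) + 0.5)

def linear_decode(code, bits=8):
    """Inverse of linear_encode: return the amplitude assigned to a code."""
    return code / (2 ** bits - 1) * 2.0 - 1.0

def mu_compress(x):
    """Logarithmic curve: dense code points near zero, sparse near full scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def mu_encode(x, bits=8):
    return linear_encode(mu_compress(x), bits)

def mu_decode(code, bits=8):
    return mu_expand(linear_decode(code, bits))

def dpcm_encode(samples):
    """Differential coding: emit the change from the previous sample."""
    previous, out = 0, []
    for s in samples:
        out.append(s - previous)
        previous = s
    return out

def dpcm_decode(differences):
    """Rebuild the samples by accumulating the differences."""
    total, out = 0, []
    for d in differences:
        total += d
        out.append(total)
    return out
```

For a quiet sample such as 0.01, the nonlinear coder reconstructs the amplitude more closely than the linear coder at the same 8 bits, which is the effect the primer describes. The differential example also shows why the technique helps: the differences of a slowly varying signal span a smaller range than the samples themselves.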


Voice channel digitization. The reduction
of an analog voice channel to a digital
representation is an essential element of
modern digital telephony, and telephony
applications have probably done more than
any other to advance the digitization art.
Most modern telephone switching equipment
works on the principle of reducing voice
channels to digital bit streams.

Standard telephony practice uses 8-bit 
pulse code modulation (PCM) and pro¬ 
duces a digital data stream at 64 kilobits 
per second. This straightforward approach 
is sufficient to represent accurately any 
analog waveform appearing on a standard 
telephony channel. More sophisticated 
encoding techniques can either reduce the 
bit rate at a similar voice quality (for ex¬ 
ample, adaptive differential pulse code 
modulation) or increase the quality of 
reproduction at the same bit rate (for ex¬ 
ample, 7-kilohertz bandwidth audio). In¬ 
expensive single-chip implementations of 
PCM algorithms, called codecs (coder-de¬ 
coders), are available from many sources. 
The accompanying sidebar briefly ex¬ 
plains common PCM techniques. 

With current technology, “toll-quality” 
speech can be maintained while encoding 
waveforms at rates as low as 16 kilobits per 
second. If we drop much below this rate, 
degradation becomes apparent, although 
the quality may still be acceptable for many 
applications. The nature of the degradation 
varies with the encoding scheme used; it 
may include loss of high-frequency re¬ 
sponse, dynamic range, or distinguishing 
voice timbre. Below about 5 kilobits per 
second, waveform encoding becomes use¬ 
less. Parametric encoding, in which signal 
parameters rather than a waveform are 
encoded, is a more useful technique in this 
range. 

The benefits of more compact encoding 
methods can be striking — particularly 
regarding the amount of storage space 
required. A voice message encoded with a 
16-kilobit-per-second method, for ex¬ 
ample, will require only one quarter of the 
storage space required by a 64-kilobit-per- 
second method. Such savings can be cru¬ 
cial in certain applications. However, this 
advantage is gained at the expense of in¬ 
creased computation resources to encode 
and decode the signal, as well as the inevi¬ 
table degradation introduced at lower bit 
rates. Papamichalis 1 provides a compre¬ 
hensive tutorial on practical voice-encod¬ 
ing techniques. 
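
The arithmetic behind these figures is easy to check. The sketch below assumes the usual telephony sampling rate of 8,000 samples per second, which the text does not state explicitly; the function names are illustrative only.

```python
SAMPLE_RATE_HZ = 8000   # assumed standard telephony sampling rate
BITS_PER_SAMPLE = 8     # 8-bit PCM, as described in the text

def pcm_bit_rate(sample_rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bit rate of a PCM stream in bits per second."""
    return sample_rate_hz * bits

def storage_bytes(bit_rate_bps, seconds):
    """Storage needed to hold a recording of the given length."""
    return bit_rate_bps * seconds // 8

# One minute of voice at the two rates discussed above.
minute_at_64k = storage_bytes(64_000, 60)   # 480,000 bytes
minute_at_16k = storage_bytes(16_000, 60)   # 120,000 bytes
```

A one-minute message therefore shrinks from 480,000 bytes to 120,000 bytes, the one-quarter ratio cited above.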

Digital signal processing. Mathematical
transformations can be applied to a digital
signal with an effect similar to that of
passing the original analog signal through
a series of filters. In many cases, signal
processing functions can be achieved more
easily and more accurately with digital
techniques than would be possible using
analog components. As an additional
benefit, we can change the digitally
simulated filter’s operational parameters
without having to replace or adjust
filter components.
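
A minimal sketch of this idea, assuming a simple finite impulse response (FIR) filter as the transformation, is shown below. The function and coefficient names are illustrative. The point is that the filter's behavior is determined entirely by a list of numbers, so changing the filter means changing data rather than components.

```python
def fir_filter(samples, coefficients):
    """Direct-form FIR filter: y[n] = sum over k of coefficients[k] * x[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coefficients):
            if n - k >= 0:          # samples before the start are treated as zero
                acc += c * samples[n - k]
        out.append(acc)
    return out

# Two "filters" that differ only in their parameters.
SMOOTH = [0.25, 0.25, 0.25, 0.25]   # four-tap moving average (low-pass behavior)
EDGE = [1.0, -1.0]                  # first difference (crude high-pass behavior)

steady = [4, 4, 4, 4, 4]
smoothed = fir_filter(steady, SMOOTH)   # ramps up to the steady level
changes = fir_filter(steady, EDGE)      # responds only where the input changes
```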

Digital signal processing techniques are 
used to advantage in many applications. 
Representative examples include signal 
analysis in modems, encryption equip¬ 
ment, telephone dialing and call-progress 
tone sensors, and speech synthesis and 
recognition systems. 

In all of these applications, the ability to
upgrade a device’s signal processing
capabilities by simply adding new
transformation software is a major
advantage because it preserves the
customer’s investment in signal processing
hardware. Software-driven signal processing
is so useful, in fact, that basic signal
processing capabilities are beginning to
appear on ordinary computing workstations
such as the NeXT machine.


Text-to-speech synthesis. Text-to- 
speech synthesis is the capability of con¬ 
verting a text stream into comprehensible 
human speech. It is steadily becoming an 
important component of voice/computing 
applications. 

Early commercial applications that re¬ 
quired speech output produced the speech 
by concatenating an appropriate sequence 
of prerecorded words, syllables, or pho¬ 
nemes. The result was not truly synthe¬ 
sized speech but rather the intelligent re¬ 
arrangement of existing speech utterances. 
This approach was most useful for applica¬ 
tions requiring a relatively narrow output 
vocabulary, such as telephone directory 
assistance systems, which pronounce only 
limited phrases and telephone numbers. 
Even here, voice inflection was a problem: 
Some speech fragments had to be recorded 
several times with different inflections, 
and the application system had to select the 
appropriate version, depending on the 
context. 

Figure 1. A point-to-point command link.

Speech synthesis systems are now available
that operate directly from an input text
stream to produce a synthesized output.
These systems vary widely in
sophistication. Low-end devices require the
application to specify phonemes and
inflections; high-end devices can convert
unannotated English text into properly
inflected speech. Some devices even analyze
linguistic context to make pronunciation
decisions (for example, “St.” may be
pronounced saint or street), and they apply
ethnic pronunciation rules, depending on
the apparent nationality of a surname.
These top-of-the-line devices are
particularly useful in applications that
must pronounce arbitrary database
information rather than fixed phrases.
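
The “St.” decision can be sketched with a toy rule. This is purely an illustrative heuristic, not a description of how any commercial synthesizer works: it assumes that “St.” after a capitalized word is a street name and that “St.” before a capitalized word is a saint's name.

```python
def expand_st(text):
    """Expand the abbreviation "St." using its neighboring words (toy heuristic)."""
    words = text.split()
    out = []
    for i, word in enumerate(words):
        core = word.rstrip(",")
        trailing = word[len(core):]
        if core != "St.":
            out.append(word)
            continue
        prev_word = words[i - 1] if i > 0 else ""
        next_word = words[i + 1] if i + 1 < len(words) else ""
        if prev_word[:1].isupper():
            expansion = "Street"      # "Main St." reads as a street name
        elif next_word[:1].isupper():
            expansion = "Saint"       # "St. John" reads as a saint's name
        else:
            expansion = core          # no evidence either way; leave unchanged
        if expansion != core and i == len(words) - 1:
            trailing += "."           # keep the sentence-final period
        out.append(expansion + trailing)
    return " ".join(out)
```

A real system would need far more context than adjacent capitalization, but the example shows why pronunciation cannot be decided one word at a time.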

A brief tutorial on speech synthesis by 
O’Malley appears elsewhere in this issue. 

Speech recognition. Speech recognition 
is the capability of recognizing spoken 
utterances from a given vocabulary set. In 
many computing applications, a user finds 
it inconvenient or impossible to use tradi¬ 
tional keyboards to interact with the sys¬ 
tem. Examples include workers wearing 
protective clothing in industrial environ¬ 
ments, users whose hands are occupied 
with other tasks, and handicapped users 
who lack the manual dexterity required by 
traditional keyboards. Speech recognition 
is an attractive alternative in these situ¬ 
ations, even though the technology still has 
significant limitations of accuracy and 
vocabulary size. 

Speech recognition through the telephone
system is particularly useful, since
hundreds of millions of telephones are in
use today. Equipped with speech recognition
and synthesis equipment, a computing
application can use these telephones as
input/output devices, making all telephone
subscribers potential users.

An overview of speech recognition by 
Peacocke and Graf appears elsewhere in 
this issue. 

Connection control. Connection control 
is the arrangement of voice channels to 
interconnect users and voice equipment. 
Although often overlooked, this aspect of 
voice in computing is every bit as essential 
an element of voice-related applications as 
the content-processing functions dis¬ 
cussed in the previous section. 

Since sophisticated voice-processing 
equipment is complex and expensive, it 
must often be shared across applications, 
systems, or users. The application must 
then control how the various users are 
connected to the voice equipment. This is 
done with switching equipment, which in 
most applications means telephone switch¬ 
ing equipment. 

Telephone signaling. Until the advent of
the Integrated Services Digital Network
(ISDN), telephone subscribers controlled
telephone connections through analog
in-band signaling arrangements. (The term
in-band means that the signaling
information travels in the subscriber’s
voice channel.) With this arrangement the
subscriber requests connections by sending
prearranged analog signals. These might be
a pattern of clicks generated by a rotary
dial mechanism, or a series of paired tones
generated by a numeric pad (dual-tone
multifrequency, or DTMF). Information
about a call’s progress is reported back to
the subscriber using other analog signals,
such as ringing or busy tones, also sent on
the subscriber’s voice channel.
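
To make the paired-tone idea concrete, the following sketch generates a DTMF tone and recovers the key with the Goertzel algorithm, a common technique for measuring signal energy at a single frequency. The row and column frequencies are the standard DTMF assignments; the function names, the 50-millisecond tone length, and the 8,000-sample-per-second rate are assumptions chosen for illustration.

```python
import math

ROW_HZ = (697, 770, 852, 941)          # one low-group tone per keypad row
COL_HZ = (1209, 1336, 1477, 1633)      # one high-group tone per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")

def dtmf_tone(key, duration_s=0.05, rate=8000):
    """Return audio samples for one key: the sum of its row and column tones."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for r, row in enumerate(KEYPAD):
        c = row.find(key)
        if c >= 0:
            f_row, f_col = ROW_HZ[r], COL_HZ[c]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(duration_s * rate)
    return [0.5 * math.sin(2 * math.pi * f_row * t / rate)
            + 0.5 * math.sin(2 * math.pi * f_col * t / rate)
            for t in range(n)]

def goertzel_power(samples, freq, rate=8000):
    """Signal power near one frequency, computed with the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def detect_key(samples, rate=8000):
    """Pick the strongest row tone and column tone, then look up the key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROW_HZ[i], rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, COL_HZ[i], rate))
    return KEYPAD[row][col]
```

Because the tones are ordinary audio, they share the voice channel with speech, which is exactly what makes the signaling in-band. A production detector would also check tone duration and reject speech that happens to contain energy at these frequencies.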

Some of these in-band techniques are 
rather complex and can deliver a lot of call 
information. Existing telephone services 
such as direct inward dialing (DID) and 
dialed number information service (DNIS) 
are good examples. In both cases, calls 
dialed to any one of a group of numbers are 
delivered through a common group of cir¬ 
cuits. In-band signaling provides informa¬ 
tion about each call so that it can be routed 
appropriately on the subscriber’s premises. 
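
On the subscriber's premises, the routing step amounts to a table lookup keyed by the digits delivered with the call. The sketch below is hypothetical throughout: the digit strings, destination names, and function name are invented for illustration.

```python
# Hypothetical mapping from delivered digits to an on-premises destination.
DIALED_NUMBER_ROUTES = {
    "4100": "sales",
    "4200": "support",
}

def route_incoming(dialed_digits, default="attendant"):
    """Choose a destination for a call arriving on the shared circuit group."""
    return DIALED_NUMBER_ROUTES.get(dialed_digits, default)
```

Calls to unrecognized numbers fall through to a default destination, here assumed to be a human attendant.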

More sophisticated network services, 
however, require a wider range of signal¬ 
ing commands and foolproof signal inter¬ 
pretation. This is true both within the tele¬ 
phone network (as the network relays call- 
setup requests between its internal compo¬ 
nents) and between the network and its 
subscribers (as they begin to use more 
sophisticated services and to install more 
advanced subscriber equipment). These 
needs are met by adopting digital signaling 
techniques in which a data transaction is 
sent on a separate channel reserved for the 
purpose. Because the technique uses a 
separate channel outside the subscriber 
voice channels, it is called out-of-band or 
common-channel signaling. 

Two kinds of digital out-of-band signal¬ 
ing are being implemented in telephone 
networks today: 

• Within telephone networks, informa¬ 
tion is exchanged using special telephone 
signaling protocols such as Signaling Sys¬ 
tem #7. SS#7 protocols are usually imple¬ 
mented as a complete signaling subnet¬ 
work that allows any network element to 
exchange signaling information with any 
other through special signaling channels 
and packet switches. 

• At the boundary between the network 
and the subscriber, out-of-band signaling 
is provided by new configurations of sub¬ 
scriber access lines — for example, the 
separate signaling channel and related sig¬ 
naling transactions defined by CCITT 
(International Consultative Committee for 
Telegraph and Telephone) recommenda¬ 
tions Q.931 and Q.932. 

Keiser and Strange 2 give an excellent
overview of telephone technology including
voice digitization, signaling, and the
evolution to ISDN, as well as a
comprehensive glossary. Rutkowski 3 and
Stallings 4 provide more detailed
information on ISDN.


Typical command link functions

A typical point-to-point command link might include the following capabilities:

Make call: Initiate a connection.

Transfer call: Move a connection from one telephone set to another.

Answer call: Allow a call to be completed to a telephone set.

Disconnect call: Terminate an in-progress telephone call.

Join existing calls: Create a connection between two or more calls already in progress.

Place call on hold: Temporarily disconnect a call from a telephone set without terminating the call.

Retrieve call from hold: Reconnect a held call to the original telephone set.

Create conference bridge: Prepare to build a call having more than the usual two parties.

Report call progress: Inform the application of call-processing events, such as party disconnect.

Report station status: Inform the application of the state of a telephone set, such as off-hook.


First-party versus third-party signal¬ 
ing. The vast majority of subscriber signal¬ 
ing arrangements offered by the public 
telephone network today are suitable only 
for first-party signaling, that is, for service 
requests intended for the same line on 
which the signaling is done. This is the 
normal mode for the public telephone net¬ 
work; a subscriber dialing a call assumes 
that it will be completed on the instrument 
from which it is dialed and not from an¬ 
other instrument. 

A telephone operator, on the other hand, 
often executes third-party signaling func¬ 
tions. Operators can request that calls be 
completed on an instrument other than 
their own, or request that a call already in 
progress between two other subscribers be 
intercepted. This is called third-party sig¬ 
naling because the operator is neither of 
the two principal parties to the call. 

When a computer replaces a human 
operator, the application frequently re¬ 
quires third-party signaling, since a single 
central application must control connec¬ 
tions between a number of subscribers and 
pieces of voice equipment. For example, 
the computing system may have to assist 
the telephone switching equipment in de¬ 
termining the correct line to which the call 
should be connected. Clearly this determi¬ 
nation must be made before the call reaches 
the first subscriber station, so first-party 
signaling cannot be used. A direct connec¬ 
tion to the switch via a third-party signal¬ 
ing arrangement is needed. 


Point-to-point command links. One 
common method of implementing third- 
party signaling is for the switch to provide 
a special point-to-point command link in¬ 
terconnecting the switch and the comput¬ 
ing equipment. The link allows the two 
environments to exchange requests and 
status information, and provides the means 
by which their respective voice and com¬ 
puting functions can be coordinated. Fig¬ 
ure 1 shows this configuration schemati¬ 
cally. 

Public telephone networks currently 
support third-party signaling with a few 
special-purpose command links such as 
Bell Communications Research’s Simpli¬ 
fied Message Desk Interface (SMDI). 5 
Private branch exchanges (PBXs) have 
been quicker to implement third-party 
signaling, although the extent of their 
support for switching functions via these 
command links varies widely. Some manu¬ 
facturers offer simple one-way links that 
announce only the arrival of calls, while 
others offer more complete two-way links 
allowing full integration of switching and 
computing application functions. The 
sidebar above shows functions supported 
by a typical command link. 
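
A few of these functions can be sketched as an in-memory model. The class below is a hypothetical stand-in for a two-way command link, not any vendor's protocol: the method names, state names, and event format are all invented for illustration.

```python
from dataclasses import dataclass
from itertools import count

@dataclass
class Call:
    call_id: int
    parties: list
    state: str = "ringing"   # ringing, connected, held, or ended

class CommandLink:
    """Hypothetical third-party command link to a switch (in-memory stand-in)."""

    def __init__(self):
        self._ids = count(1)
        self.calls = {}
        self.events = []      # call-progress reports delivered to the application

    def _report(self, call, what):
        self.events.append((call.call_id, what))

    def make_call(self, origin, destination):
        """Initiate a connection on behalf of two other parties."""
        call = Call(next(self._ids), [origin, destination])
        self.calls[call.call_id] = call
        self._report(call, "ringing")
        return call.call_id

    def answer_call(self, call_id):
        call = self.calls[call_id]
        call.state = "connected"
        self._report(call, "connected")

    def hold_call(self, call_id):
        call = self.calls[call_id]
        call.state = "held"
        self._report(call, "held")

    def retrieve_call(self, call_id):
        call = self.calls[call_id]
        call.state = "connected"
        self._report(call, "retrieved")

    def transfer_call(self, call_id, old_set, new_set):
        """Move one end of the connection to a different telephone set."""
        call = self.calls[call_id]
        call.parties[call.parties.index(old_set)] = new_set
        self._report(call, f"transferred to {new_set}")

    def disconnect_call(self, call_id):
        call = self.calls.pop(call_id)
        call.state = "ended"
        self._report(call, "disconnected")
```

Note that every request names the call or telephone sets it affects. The application issuing the request is not a party to the call, which is what makes this third-party signaling.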

Commercial command link specifica¬ 
tions include Northern Telecom’s ISDN 
Applications Protocol, AT&T’s Adjunct/ 
Switch Application Interface, 6 and Mitel’s 
Host-Command Interface. 

The most active public forums for the 
discussion of third-party signaling stan¬ 
dards are the European Computer Manu¬ 
facturers’ Association, the American Na¬ 
tional Standards Institute, and the AT&T 
ISDN/DMI Users’ Group. The sidebar on 
the next page provides details on these or¬ 
ganizations. 


Software architectures for voice ap¬ 
plications. Constructing a voice-related 
application involves not only knowledge 
of and access to voice devices, but also the 
orderly arrangement of software to facili¬ 
tate application construction and to ensure 
reliable operation. For this discussion we 
can group all of the content-processing and 
connection control technologies discussed 
earlier into a single category called voice 
resources. 

Two aspects of software architecture are 
particularly relevant to voice-related ap¬ 
plications: abstract models for voice re¬ 
sources, and software architectures for 
distributed access to voice resources. 

Abstract models for voice resources. It
is possible to write voice-related
applications by directly invoking the
content-processing and connection control
capabilities of specific voice devices. In
practice, however, an additional layer of
abstraction is needed to simplify
application creation. The voice resources
must be presented to the application
programmer in a way that insulates the
application from the implementation
details of those resources. This is the
concept of resource modeling.

Modeling results in an abstract applica¬ 
tion view of a specified domain of voice 
resources. The application programmer 
works within that model and relies on 
system software beneath the application to 
translate the programmer’s logical view of 
resources into physical resource capabili¬ 
ties. This concept is shown in Figure 2, 
which makes clear the distinction between 
the abstract resource interface used by the 
application and the physical interface to a 
specific resource. 

Figure 2. Layers of resource abstraction. (Layers shown: abstract resource interface; abstract/physical mapping; physical resource interface.)


Command link standards groups

ECMA

The European Computer Manufacturers’ Association Technical Group 32
Task Group 11 has taken up the question of computer-supported
telecommunications applications. The first edition of the technical report is
currently in draft form and is expected to be released during 1990. For further
information contact ECMA at Rue du Rhône 114, CH-1204, Geneva, Switzerland.

ANSI

The American National Standards Institute Committee T1S1.1, in its
subworking group on ISDN networking, has undertaken a project entitled “Switch
to Computer Application Interface.” For further information contact the
Exchange Carrier Standards Association (Secretariat for ANSI), 5430 Grosvenor
Ln., Bethesda, MD 20814.

ISDN/DMI Users’ Group

AT&T sponsors the ISDN/DMI Users’ Group, which includes a special-interest
group on ISDN applications and command links. Work in this SIG led to the
publication of AT&T’s Adjunct/Switch Application Interface specification. 6 For
further information contact Carl Bronell at (201) 576-2910.


Programmers have long taken this
modeling concept for granted with respect
to most computing resources. An
application programmer does not expect to
be concerned with details of the processor
on which application software runs, the
disk devices on which application files are
stored, or the display or printer on which
the output appears. All these resources are
modeled by the operating system, and the
programmer works with virtual resources
rather than real ones. As a result, software
can be written easily to operate on a wide
range of actual hardware systems.

The same can (and should) be true of
voice resources available to the
application programmer. Since any given
application is likely to represent a greater
investment in application software than in
voice equipment, it makes sense to write
the application in an environment where new
voice implementations can be incorporated
without requiring substantial software
modification. This means writing the
application to manipulate abstract voice
resources rather than physical resources,
and relying on system software to bind the
abstract resources to a range of equivalent
physical resources at execution time. For
example, an application programmer
should be able to issue an abstract
command to place an outgoing call rather
than having to implement the protocol
needed to issue such a request to a
specific telephone switch.
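
The place-a-call example can be sketched as follows. Everything here is hypothetical: the class names are invented, and the two "wire formats" are made up solely to show that different switches may expect different requests. The application function at the bottom depends only on the abstract interface.

```python
from abc import ABC, abstractmethod

class SwitchDriver(ABC):
    """Physical resource interface: one subclass per kind of switch."""

    @abstractmethod
    def place_call(self, origin, destination):
        """Return the request this particular switch expects."""

class VendorADriver(SwitchDriver):
    def place_call(self, origin, destination):
        return f"MAKECALL {origin} {destination}"               # invented format

class VendorBDriver(SwitchDriver):
    def place_call(self, origin, destination):
        return f"<dial from='{origin}' to='{destination}'/>"    # invented format

class Telephony:
    """Abstract resource interface seen by the application programmer."""

    def __init__(self, driver):
        self._driver = driver   # binding to a physical resource, chosen at run time

    def place_call(self, origin, destination):
        return self._driver.place_call(origin, destination)

def call_customer(phone, agent_extension, customer_number):
    """Application code: no knowledge of which switch is installed."""
    return phone.place_call(agent_extension, customer_number)
```

Replacing the switch means supplying a different driver when the `Telephony` object is created; `call_customer` is unchanged.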

Several computing system vendors are 
making good progress implementing ab¬ 
stract models of voice resources. Examples 
include Wang’s Speech and Telephony 
Environment for Programmers, Digital 
Equipment’s Computer Integrated Tele¬ 
phony, and IBM’s Telephony Applications 
Services and Callpath. 


Standardizing voice resource models. 
Given the importance of abstract voice 
resource models, standards for the abstract 
resources are perhaps even more important 
than standards for physical resources such 
as command or signaling-link protocols. 

The abstract model’s importance has 
been borne out in the standards discussions 
cited earlier. Each working group learned 
quickly that this modeling work had to be 
completed before detailed work on proto¬ 
cols could begin. The anticipated work 
product from the standards effort is not 
simply a protocol for physical command 
links but, more importantly, an abstract 
model of switching services available to 
application programmers. 


Distributed access to voice resources. 
Voice-related applications are likely to 
require distributed access to voice re¬ 
sources. This need arises because voice 
resources such as switches are, by their 
nature, centralized. But the applications 
needing to interact with those resources 
are very likely distributed across multiple 
computers or workstations. 

This requires a server-style distributed 
processing architecture that can share 
voice resources as needed across applica¬ 
tions without relying on application writ¬ 
ers to arbitrate resource demands. 

In such an environment the abstract 
resource interface in Figure 2 is present on 
each application node, and physical re¬ 
source interfaces are present on each voice 
resource server. Applications are not 
aware of the complexity introduced by the 
distributed environment, because they 
continue to use the same abstract resource 
interface as before. 
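
The arbitration role of such a server can be sketched in a few lines. This is a single-process stand-in for what would really be a network service, and the class, method, and resource names are all assumptions made for illustration.

```python
from collections import deque

class VoiceResourceServer:
    """Shares a fixed pool of voice resources among requesting applications."""

    def __init__(self, resource_names):
        self._free = deque(resource_names)
        self._waiting = deque()     # applications queued for a resource
        self.owner = {}             # resource name -> application holding it

    def request(self, app):
        """Grant a free resource, or queue the application and return None."""
        if self._free:
            resource = self._free.popleft()
            self.owner[resource] = app
            return resource
        self._waiting.append(app)
        return None

    def release(self, resource):
        """Free a resource; pass it to the longest-waiting application, if any."""
        del self.owner[resource]
        if self._waiting:
            next_app = self._waiting.popleft()
            self.owner[resource] = next_app
            return next_app
        self._free.append(resource)
        return None
```

The applications never negotiate with one another. Each simply asks the server, and the server decides who holds the resource and who waits.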

While the necessity of distributed server 
support for voice resources may seem 
obvious, the level of support for this capa¬ 
bility varies markedly among computing 
vendors. 





























Combinations of 
technology: An 
application example 

Interesting voice/data applications usu¬ 
ally combine voice technologies. Some of 
these applications are listed in the accom¬ 
panying sidebar. 

For example, consider a customer ser¬ 
vice application for a financial institution. 
Incoming calls are greeted by digitized 
voice playback announcing the caller’s 
options for service. Voice or DTMF (dual¬ 
tone multifrequency) recognition equip¬ 
ment then determines what kind of service 
the caller is requesting and a suitable call 
destination is computed. A command link 
between the telephone switch and the 
computer system allows the application to 
redirect the call to a voice synthesizer, 
which reads database information back to 
the caller, or to recognition equipment that 
allows the caller to enter a transaction 
using the telephone numeric pad or normal 
speech. Throughout the process, the caller 
might have optional access to a human 
attendant supported by a data terminal 
connected to the same application. 
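
The control flow of this example can be outlined in code. The menu choices, service names, and destinations below are hypothetical, and each step is represented only by a descriptive string rather than by real playback, recognition, or switching equipment.

```python
# Hypothetical menu: the digit (or spoken choice) a caller gives after the greeting.
MENU = {"1": "balance inquiry", "2": "funds transfer", "0": "attendant"}

# Hypothetical mapping from requested service to the equipment that handles it.
DESTINATIONS = {
    "balance inquiry": "voice synthesizer",       # reads database information back
    "funds transfer": "recognition equipment",    # accepts keypad or spoken input
}

def handle_call(caller_choice):
    """Return the sequence of steps the application takes for one incoming call."""
    steps = ["play digitized greeting"]
    service = MENU.get(caller_choice, "attendant")
    steps.append(f"recognized request: {service}")
    destination = DESTINATIONS.get(service, "human attendant")
    steps.append(f"redirect call to {destination} via command link")
    return steps
```

Unrecognized input falls through to the human attendant, reflecting the optional attendant access described above.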

Academic and commercial interest
in the possibilities of voice and 
computing is increasing, and the 
barriers between the disciplines mentioned 
early in this article are gradually being 
dismantled. Still, historical boundaries 
between voice and computing technolo¬ 
gies are evident, and most products and 
applications draw a clear distinction be¬ 
tween the computing and voice portions of 
the solution. 

Researchers have been investigating 
what a seamlessly integrated voice and 
computing environment might look like 
and whether it would offer capabilities not 
possible with today’s bifurcated environ¬ 
ments. Schmandt and McKenna, 7 Zell¬ 
weger et al., 8 and Herman et al. 9 describe 
some of this work and the opportunities 
that further integration might bring. ■ 

Example voice/data applications

Customer service

• Using calling-number identification to display relevant customer records
automatically

• Transferring customer record context to a new agent along with the
incoming call

Technical-help desk

• Using dialed-number information to direct callers to appropriate
databases

• Using voice and DTMF recognition, along with voice synthesis, to
provide unattended readback of database information

Telephone operator services

• Automatically placing and dialing-back calls

• Implementing credit card, collect call, and charge quotation services
without human operators

Message desk

• Implementing voice-mail services

• Simplifying transcription of telephone messages by providing attendant
with automated call information

• Reading back text-mail messages with synthesized voice

Hands-free system operation

• Using voice recognition and voice synthesis as a substitute for keyboard
and screen interaction

• Helping handicapped users or users working at tasks that require hands
and eyes to be directed elsewhere


References

1. P. Papamichalis, Practical Approaches to
Speech Coding, Prentice-Hall, Englewood
Cliffs, N.J., 1987.

2. B. Keiser and E. Strange, Digital Telephony
and Network Integration, Van Nostrand
Reinhold, New York, 1985.

3. A. Rutkowski, Integrated Services Digital
Networks, Artech House, Dedham, Mass.,
1985.

4. Integrated Services Digital Networks
(ISDN), 2nd ed., W. Stallings, ed., CS Press,
Los Alamitos, Calif., Order No. 823, 1988.

5. Interface Between Customer-Premise
Equipment: Simplified Message Desk and
Switching System: 1AESS, Document
TR-TSY-00283, Issue 1, Bell Communications
Research, Red Bank, N.J., July 1985.

6. Adjunct/Switch Application Interface
(ASAI) Specification, Selection No.
555-025-203, Issue 1.0, AT&T, Basking Ridge,
N.J., Dec. 1989. Available from AT&T’s
Customer Information Center at
(800) 432-6600.

7. C. Schmandt and M. McKenna, “An Audio
and Telephone Server for Multi-Media
Workstations,” Proc. Second IEEE Conf.
Computer Workstations, CS Press, Los
Alamitos, Calif., Order No. 810, 1988,
pp. 150-159.

8. P. Zellweger, D. Terry, and D. Swinehart,
“An Overview of the Etherphone System
and Its Applications,” Proc. Second IEEE
Conf. Computer Workstations, CS Press,
Los Alamitos, Calif., Order No. 810, 1988,
pp. 160-168. This and several related papers
are also available in Etherphone: Collected
Papers 1987-1988, Xerox PARC Tech.
Report CSL-89-2, Palo Alto, Calif., May
1989.

9. G. Herman et al., “The Modular Integrated
Communications Environment (MICE): A
System for Prototyping and Evaluating
Communications Services,” Proc. Int’l
Switching Symp., Phoenix, Ariz., Mar.
1987.



Carl R. Strathmeyer is the manager of intelli¬ 
gent networks marketing in Digital Equipment 
Corporation’s Telecommunications Industry 
Marketing Group. He is responsible for
determining and executing DEC’s marketing
strategy for intelligent networks and related
application areas. During his 12 years with the firm,
he has held management and technical posi¬ 
tions in telecommunications industry market¬ 
ing, computer network engineering, corporate 
telecommunications, and business systems 

Strathmeyer holds a BA in mathematics and 
computer science from Dartmouth College. 

The author can be contacted at Digital Equip¬ 
ment Corporation, Telecommunications Indus¬ 
try Marketing, 6 Tech Drive, Andover, MA 
01810. 






Text-To-Speech 
Conversion Technology 


Michael H. O’Malley 
Berkeley Speech Technologies 


Text-to-speech (TTS) conversion
transforms linguistic information 
stored as data or text into speech. It 
is widely used in audio reading devices for 
blind people. In the last few years, how¬ 
ever, use of text-to-speech conversion 
technology has grown far beyond the dis¬ 
abled community to become a major ad¬ 
junct to the rapidly growing use of digital 
voice storage for voice mail and voice 
response systems, which provide tele¬ 
phone information access. For example, 
text-to-speech technology can convert 
electronic mail to voice mail for audio 
access by phone. It can also permit field 
personnel to access large text databases, 
like parts inventories, by telephone. 

The rapid expansion of text-to-speech 
technology comes in part from the ad¬ 
vances in delivery methods and speech 
quality over the past 30 years. This article 
discusses the historical and theoretical 
bases of contemporary high-performance 
TTS systems and their current design. 
Because of space limitations, I have drawn 
examples mainly from Berkeley Speech 
Technologies’ proprietary text-to-speech 
system, T-T-S. 

What is TTS? 

Any text-to-speech system consists of 
two major elements. Starting with the out¬ 
put, we need some type of sound-generat¬ 
ing mechanism whose function is analo¬ 
gous to that of the human vocal tract. A 
mouth by itself cannot talk, so we also need
a module whose input is the text or other
linguistic information to be spoken and
whose output drives the sound-generating
mechanism. In modern technology, both of
these components are software. We can
implement them in such a way that they
can run on many kinds of hardware
platforms.

Reading an English text out loud is currently the most
successful simulation by a computer of a complex human
mental function. This article shows how it can be done.

A schematic of the human vocal tract
appears in Figure 1. Air under pressure
from the lungs flows through the trachea
and between the vocal folds in the larynx. The
opening between the vocal folds is called
the glottis, and the air flow as a function of
time is called the glottal waveform.

0018-9162/90/0800-0017$01.00 © 1990 IEEE

For voiced sounds, the correct combination
of air pressure and muscle tension
causes the vocal folds to vibrate, generat¬ 
ing a series of pulses of air. For aspirate 
sounds such as /h/, the vocal folds adjust to 
generate turbulent air flow. For many 
sounds, we also generate noise in a higher 
part of the vocal tract. 

Some of the noise in fricative sounds 
(/s/, /z/, /f/, etc.) actually results from tur¬ 
bulent air flowing through a constriction 
between the tongue and the roof of the 
mouth. On the other hand, the main source 
of noise in plosive sounds (/b/, /d/, /k/, etc.) 
is a burst from the sudden release of air 
pressure built up behind a closure of the 
vocal tract, followed by a short period of 
frication. In English and certain other lan¬ 
guages, “voiceless” stops such as /k/ are 
also followed by a period of aspiration 
noise when they occur in particular lin¬ 
guistic environments. 

The noise generated at the glottis or 
elsewhere in the vocal tract is modified as 
it passes through the oral and nasal cavities 
and radiates from the head as a speech 
waveform. As it progresses along the vocal 
tract, some of the flow reflects backwards. 
In fact, some energy even reflects back 
through the glottis toward the lungs. These 
reflections induce resonances (frequencies 
at which the sound is reinforced) and an¬ 
tiresonances (frequencies at which the 
sound is absorbed). 

The frequency of the glottal pulses is 
one of the important parameters character¬ 
izing voiced sounds, corresponding to the 
fundamental frequency or pitch of the 
voice. However, finer details of the vocal
fold vibrations as well as other elements of
the sublaryngeal system also affect the
overall sound quality.

Above the larynx, the shape of the 
tongue and lips and the size of the velar 
opening to the nasal cavity are major fac¬ 
tors in determining the resonances of the 
vocal tract and thus the sound of the radi¬ 
ated speech waveform. Here again, even 
an obscure factor, such as the compressi¬ 
bility of the cheeks, can have some effect 
on the perceived sound. 

The major features of the vocal tract and
how they influence our perception of
speech have been understood, at least in a
general way, for years. 1,2 Because of the
tremendous complexity of the vocal tract,
however, the attempt to understand and
model the human speech production
mechanism remains an active
topic in current acoustic and physiological
phonetics research. 3

Our current understanding does permit 
the synthesis of intelligible speech, but our 
models, especially in their dynamic behav¬ 
ior, are not yet adequate to make synthetic 
speech that is indistinguishable from 
human speech. 


Vocal tract models based on the physical 
shape and the physiology of the sound 
production mechanism provide many sci¬ 
entific insights but are rather difficult to 
use. The complex relationship between 
these physical parameters and the resulting 
speech waveform involves nonlinear equa¬ 
tions. A small change in one of the model’s 
parameters often makes a major change in 
the resulting sound. 

Speech can also be modeled in strictly 
acoustic terms as a continuously changing 
spectrum (see Figure 2). The most com¬ 
mon parameterization in this domain is in 
terms of the resonances and antiresonances 
(poles and zeros), familiar from traditional 
engineering models of linear systems. In 
this model, the speech production process 
consists of an excitation source, presum¬ 
ably representing the glottal waveform, 
which drives a “filter,” presumably repre¬ 
senting the oral and nasal cavities. 

The most important acoustic parameters 
for speech synthesis are the fundamental 
frequency of the glottal waveform and the 
frequencies of the first three narrow-bandwidth
resonances, or formants. For a typical
male voice, the fundamental varies over
about an octave centered around 120 Hz,
while the first three formants vary around 
500 Hz, 1,500 Hz, and 2,500 Hz, respec¬ 
tively. The fundamental frequency for a 
female voice typically falls closer to 200 
Hz, while the formant frequencies, which 
are inversely proportional to head size, 
measure about 10 percent higher. 
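
The figures in this paragraph can be turned into concrete ranges with a little arithmetic. The Python sketch below is illustrative only. It assumes that "centered around 120 Hz" means geometrically centered (half an octave on either side), and it applies the 10 percent figure uniformly to all three formants. The function names are hypothetical.

```python
import math

MALE_F0_HZ = 120.0                           # centre of the male fundamental range (from the text)
MALE_FORMANTS_HZ = (500.0, 1500.0, 2500.0)   # typical first three formants (from the text)


def octave_range(center_hz):
    """Return (low, high) for a one-octave span geometrically centred on center_hz.

    Geometric centring is an assumption; the article only says "about an octave".
    """
    half_octave = math.sqrt(2.0)
    return center_hz / half_octave, center_hz * half_octave


def scale_formants(formants_hz, percent_higher):
    """Raise every formant by the same percentage."""
    factor = 1.0 + percent_higher / 100.0
    return tuple(f * factor for f in formants_hz)


low, high = octave_range(MALE_F0_HZ)
female_formants = scale_formants(MALE_FORMANTS_HZ, 10.0)

print(f"male fundamental range: {low:.0f} to {high:.0f} Hz")
print("formants raised by 10 percent:", [round(f) for f in female_formants])
```

Under these assumptions the male fundamental spans roughly 85 to 170 Hz, and the raised formants come out near 550, 1,650, and 2,750 Hz.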

This source/filter model of speech pro¬ 
duction is the most widely used model for 
acoustic phonetic research as well as for 
the most sophisticated applications of 
speech technology. However, it is only one 
of many possible parameterizations of the 
speech production mechanism. 

For example, another model with wide 
applicability approximates the vocal tract 
by a cascade of short cylinders, each of a 
different diameter. This model is mathe¬ 
matically related to the widely used linear 
predictive coding (LPC) technique used in 
speech transmission systems and for re¬ 
cording human speech in compressed form 
on chips. 

This multitube model has the advantage 
that it is easy to fit human speech to the 
model automatically, so we only need to 
transmit the model parameters rather than 
the full speech waveform. However, it is 
very difficult to relate the parameters of 
this multitube model to the usual scientific 
descriptions of speech. Therefore, this 
model has had limited applicability in the 
most sophisticated speech synthesis and 
recognition systems. 

In my work at the University of Michi¬ 
gan in the 1960s, the real-time parametric 
voice module was actually a hardware 
analog filter. The hardware consisted of 
three voltage-controlled filters, a noise 
generator, and a pulse generator connected 
to an IBM 1800 computer with a total of 
only 8 kilowords of memory. 

The filters modeled the resonances of 
the vocal tract and the pulse generator 
modeled the glottal waveform. We entered 
speech parameters by hand into the com¬ 
puter, which then sent them in real time to 
the hardware. By varying the parameters, 
we could study the effect of changing the 
voice model. The system helped in study¬ 
ing human speech production and percep¬ 
tion, but was not sufficiently powerful to 
support the automatic conversion of text 
into speech in real time. 

One major factor in the progress toward 
this goal has been the development of 
digital simulation of the kind of hardware 
used in the early systems. In general, an 
analog resonator can be simulated digitally 
by a very simple program consisting of 
addition, multiplication, and some storage 


COMPUTER 























1 



in. i . r. 1 . 

p 




nr 1 

lliil ft 


1, 

i 

| 

ft 


X 

|j ■1(111 

w 

ill 

i 111 

|, , 

•Ipl 

|l| 


S 


JfiM 

1 ■( 


V 1 

Iff 

,J 

y 

; 

"•hj)| ill 

J| 


\ 

pi 

rif 

i 1 





i , i, • ; < l I 


k.liii 

fta *iir.L 

b * ij 6 9 £ f ^ Q * s n r 


Figure 2. Speech modeled in acoustic terms as a continuous spectrum: “How'd you get home last night?” 


locations. Approximately 20 multiplica¬ 
tions are required in order to generate a 
single speech sample for a typical vocal 
tract model. 
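
To show how little code a digital resonator needs, the sketch below implements a standard two-pole recursive filter and cascades three of them at the formant frequencies mentioned earlier. It is not the code of any system described in this article. The coefficient formulas are the usual textbook ones for a second-order resonator, and the bandwidths, the 10,000-sample-per-second rate, and the impulse-train source are assumptions chosen for the example.

```python
import math


def resonator_coefficients(center_hz, bandwidth_hz, sample_rate_hz):
    """Coefficients for the recurrence y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * center_hz * t)
    a = 1.0 - b - c  # normalises the gain to 1 at 0 Hz
    return a, b, c


def resonate(samples, center_hz, bandwidth_hz, sample_rate_hz):
    """Run one resonator over a list of samples: three multiplies and two stored values."""
    a, b, c = resonator_coefficients(center_hz, bandwidth_hz, sample_rate_hz)
    y1 = y2 = 0.0  # the two previous outputs (the "storage locations")
    out = []
    for x in samples:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def pulse_train(f0_hz, sample_rate_hz, n_samples):
    """Crude stand-in for the glottal source: one unit impulse per pitch period."""
    period = round(sample_rate_hz / f0_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


SAMPLE_RATE = 10_000  # assumed; within the range researchers preferred
speech = pulse_train(120.0, SAMPLE_RATE, 1000)
# Formant frequencies are from the text; the bandwidths are illustrative guesses.
for formant_hz, bandwidth_hz in ((500.0, 60.0), (1500.0, 90.0), (2500.0, 150.0)):
    speech = resonate(speech, formant_hz, bandwidth_hz, SAMPLE_RATE)
```

Each resonator in this sketch costs three multiplications per output sample, so a model with several formant and source filters quickly approaches the figure of about 20 multiplications per sample.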

“Telephone quality” speech requires 
8,000 samples per second, although most 
speech researchers prefer a somewhat 
higher bandwidth — 10,000 to 20,000 
samples per second. With the computing 
power available by the 1970s, researchers 
could replace analog human sound produc¬ 
tion models by digital synthesizers in 
almost all research applications. Even 
when they lacked the computing power 
necessary for real-time synthesis, develop¬ 
ing a more powerful, all-digital model and 
running it in batch mode was preferable to 
trying to make do with an inadequate hard¬ 
ware analog of the vocal tract. Thus, for 
more than 10 years most speech synthesiz¬ 
ers have been software. 

In 1978 Texas Instruments developed 
the “Speak & Spell” LPC chip. This chip 
can perform an integer multiply/add in less 
than 5 microseconds and therefore can
implement a 10-pole lattice filter digitally.
Unfortunately, this chip implements the 
multitube vocal tract model, with the con¬ 
sequent difficulty in generating synthetic 
parameters. However, because we can 
easily extract these filter coefficients from 
natural human speech, these chips found 
widespread use delivering speech in a 
number of products such as toys and talk¬ 
ing cars. Of course, human voice stored 
and compressed by algorithm is not truly 
“synthetic” in the ordinary sense of “artifi¬ 
cial” or “not of natural origin,” but is more 
properly considered a low-bit-rate record¬ 
ing. True synthesis implies a model of a 
speaker, not an actual speaker. 

During the 1980s, easily programmable 
digital signal processing chips became 
available. These chips, such as the Texas 
Instruments TMS320, could perform a 
multiply/add cycle in less than one microsecond.
This meant that, at least for audio
frequencies, even the most complex research
voice models could be programmed
to run in real time in a relatively inexpensive
system. After this, practical hardware
or computational limitations on the im¬ 
plementation of a synthesis voice model no 
longer existed. 
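
The real-time claims above follow from a simple budget calculation. The sketch below uses only numbers quoted in the article (about 20 multiplications per sample, and sampling rates of 8,000 to 20,000 samples per second). It ignores all other work the processor must do, so the results are an optimistic bound rather than a measured requirement. The function names are hypothetical.

```python
MULTIPLIES_PER_SAMPLE = 20  # approximate figure quoted in the article


def multiplies_per_second(sample_rate_hz, per_sample=MULTIPLIES_PER_SAMPLE):
    """Multiplications needed each second to keep up with the output rate."""
    return sample_rate_hz * per_sample


def time_budget_us(sample_rate_hz, per_sample=MULTIPLIES_PER_SAMPLE):
    """Microseconds available for each multiply/add in real-time synthesis."""
    return 1_000_000 / multiplies_per_second(sample_rate_hz, per_sample)


for rate in (8_000, 10_000, 20_000):
    print(rate, multiplies_per_second(rate), time_budget_us(rate))
```

At 8,000 samples per second the budget is 6.25 microseconds per multiply/add, which a chip needing just under 5 microseconds can meet. At 20,000 samples per second the budget shrinks to 2.5 microseconds, which is comfortable only for a processor with a cycle near one microsecond.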

However, the voice model constitutes a 
very small part of the text-to-speech syn¬ 
thesis problem. A voice model requires 
from a few hundred to at most a few thou¬ 
sand computer instructions. In contrast, 
the text-to-parameter model for a high- 
quality synthesizer requires from 100 to 
1,000 times as much memory as the voice 
model to capture the extensive linguistic 
knowledge required for true synthesis. 

It is interesting to compare the relative 
computational complexity in current 
speech recognition and speech synthesis 
systems. A speech recognition system 
must have some kind of speech analysis 
module, which functions as an “inverse” of 
the voice model of a speech synthesis sys¬ 
tem. However, this analysis task is more 
difficult than the corresponding synthesis 
because the signal is “real” rather than the 
result of a simplifying model. The better
speech recognition models are related to 
the synthesis voice model, but they nor¬ 
mally occupy a larger fraction of the avail¬ 
able computer resources. 

A speech recognition system must also 
have a “language model” to guide the map¬ 
ping from the analysis parameters to the 
recognized text. In some sense, this mod¬ 
ule is the inverse of the text-to-parameter 
conversion module in a speech synthesis 
system. 

In actuality, the language module for a 
speech recognition system is often rather 
simple and “algorithmic” rather than 
knowledge intensive. In fact, some recog¬ 
nition language models contain no linguis¬ 
tic information at all. They work with any 
language or even with nonlanguage 
sounds. However, even these simple lan¬ 
guage modules tax the power of current 
computer systems. 

Generally speaking, text-to-speech sys¬ 
tems are limited by our current knowledge 
of linguistics. Speech recognition systems 
are more limited by computing resources 
and by our ability to apply the linguistic 
knowledge available. 

The text-to-parameter conversion mod¬ 
ule takes as input an English text (or, more 
usually, structured information from a 
database) and generates the parameters 
that can then drive a vocal tract model. We 
can think of this module as a model of how 
we convert linguistic information into 
parameters that drive the speech produc¬ 
tion mechanism. In other words, how do 
we read aloud from a printed text? 

Of course, we know much less about 
how we process language than we know 
about how the human vocal tract works. 
However, we know a great deal about the 
structure of language. Linguistics is an old 
science and, especially over the past 30 
years, researchers have developed a num¬ 
ber of models for various parts of lan¬ 
guage. The text-to-parameter module 
might not really provide a good model of 
how we generate speech, but it does incor¬ 
porate a great deal of knowledge about the 
English language. 

The process of converting text into 
speech parameters breaks down into a 
number of stages, as shown in Figure 3. 
The following sections summarize these 
various subprocesses. 

Text normalization 

Actual texts have a great deal of symbolic
material such as numbers, abbreviations,
acronyms, and information signaled
by graphic layout. The first step in a text-
to-speech conversion system converts 
such information into a standard text for¬ 
mat. The complex conversion process 
involves various types of local parsing. 
While the details lie beyond the scope of 
this article, some examples can illustrate 
the various difficulties. 

First, consider the different ways we can 
read numbers. For example, we can read 
the sequence “415” in three different ways 
depending on whether it is an area code, 
part of an address, or a dollar amount. 
Furthermore, the rules for converting 
numbers into spoken English will differ in 
the United States and in England. 
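
The three readings of “415” can be sketched in a few lines of Python. This is a toy illustration, not the normalizer of any real system. It assumes the context (area code, address, or dollar amount) has already been identified, handles only numbers below 1,000, and uses a hypothetical british flag to show one well-known transatlantic difference, the “and” after “hundred.”

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits(n):
    """Spell out 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def cardinal(n, british=False):
    """Spell out 0..999 as a cardinal number."""
    if n < 100:
        return two_digits(n)
    hundreds, rest = divmod(n, 100)
    words = f"{ONES[hundreds]} hundred"
    if rest == 0:
        return words
    return words + (" and " if british else " ") + two_digits(rest)


def read_number(digits, context, british=False):
    """Read a digit string according to a context label supplied by the caller."""
    if context == "area_code":
        return " ".join(ONES[int(d)] for d in digits)       # digit by digit
    if context == "address":
        head, tail = digits[:-2], int(digits[-2:])           # e.g. "4" + "15"
        head_words = [ONES[int(d)] for d in head]
        if tail == 0:
            tail_words = "hundred"
        elif tail < 10:
            tail_words = "oh " + ONES[tail]
        else:
            tail_words = two_digits(tail)
        return " ".join(head_words + [tail_words])
    if context == "dollars":
        unit = "dollar" if int(digits) == 1 else "dollars"
        return f"{cardinal(int(digits), british)} {unit}"
    raise ValueError(f"unknown context: {context}")


for ctx in ("area_code", "address", "dollars"):
    print(ctx, "->", read_number("415", ctx))
```

The hard part in practice is deciding which context applies, which is where the local parsing mentioned above comes in.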

The pronunciation of various abbrevia¬ 
tions is often determined by context. For 
example, the pronunciation of letter/punc¬ 
tuation mark sequences differs markedly 
depending upon where they occur: 

• Dr. Jones lives on Jones Dr. 

• St. James St. 

• Jan. 22 is my wife's birthday; her name 
is Jan. 
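
A very small positional heuristic is enough to handle the three examples above, and it also shows why real normalizers are so much larger. The Python sketch below is an assumption-laden illustration: the table of abbreviations, the capital-letter test, and the function name are all hypothetical, and the heuristic fails as soon as a street abbreviation is followed by a capitalized word.

```python
# Each entry gives (reading before a name, reading after a name).
TITLE_OR_STREET = {
    "Dr.": ("doctor", "drive"),
    "St.": ("saint", "street"),
}


def expand_abbreviations(text):
    """Expand a few ambiguous abbreviations using only the neighbouring token."""
    tokens = text.split()
    out = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if tok in TITLE_OR_STREET:
            before_name, after_name = TITLE_OR_STREET[tok]
            # A following capitalised word suggests a title; otherwise a street type.
            out.append(before_name if nxt[:1].isupper() else after_name)
        elif tok == "Jan." and nxt[:1].isdigit():
            out.append("January")        # month only when a day number follows
        else:
            out.append(tok)              # includes the name "Jan." at sentence end
    return " ".join(out)


for sentence in ("Dr. Jones lives on Jones Dr.",
                 "St. James St.",
                 "Jan. 22 is my wife's birthday; her name is Jan."):
    print(expand_abbreviations(sentence))
```

Note that the sketch drops the sentence-final period when it expands “Dr.” to “drive”; a fuller treatment would have to remember that the same mark also ended the sentence.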

The effect of punctuation marks is also 
often context determined. For example, a 
period at the end of a sentence has a major 
effect on sentence prosody. But a period at 
the end of an abbreviation, in a decimal 
number, or after a middle initial in a name 
has a very different significance and must 
not be misinterpreted as a sentence termi¬ 
nation in pronunciation. Columnar lists of 
items, pronounced as if they ended with a 
period, often do not have any punctuation 
at all. In this case, the text-to-speech sys¬ 
tem must recognize the two-dimensional 
shape of the text to pronounce it correctly. 

All of the above phenomena can be 
handled reasonably successfully, but they 
require an extensive, nonalgorithmic 
computer program. Such a program cap¬ 
tures facts about normal American English 
written text that almost every literate per¬ 
son knows implicitly. It probably does not 
provide a good model of our mental pro¬ 
cessing, but it does model human knowl¬ 
edge in the same sense that a good chess 
program models a chess master’s knowl¬ 
edge of chess. 


Word pronunciation 

How we pronounce English words today 
depends on 2,000 years of history, dozens 
of wars, and many population migrations. 
Because of this, English has by far the most
complex relationship between spelling and
word pronunciation of any alphabetic 
language. 

We all learned the rule that a final “e” 
makes the preceding vowel tense — the 
rule that turns “dud” into “dude.” How¬ 
ever, a reasonably accurate text-to-speech 
system requires several thousand such 
rules to handle English. 

Any language with an alphabetic writ¬ 
ing system has rules that convert spelling 
into sound. However, words borrowed 
from another language tend to reflect the 
rules of that language, modified to fit the 
patterns of the new language. The com¬ 
plexity of English spelling derives from its 
diversity of sources. 

For example, the large number of Eng¬ 
lish words borrowed from Romance lan¬ 
guages tend to be stressed on the third 
syllable from the end (“intelligent”). How¬ 
ever, some affixes move the stress location 
(“intelligibility”), while others do not 
(“intelligently”). English speakers often 
stress names and other words of recent 
non-English origin, such as “Dukakis,” on 
the second syllable from the end, regard¬ 
less of their stress in their original 
language. 

You might think that pronunciation 
rules could be replaced with a large dic¬ 
tionary: just put the exact pronunciation of 
each English word in a table and then look 
it up. In fact, a text-to-speech system must 
use a dictionary to pronounce certain ex¬ 
ceptional words. However, the ordinary 
English speaker encounters a large number 
of words and an even larger number of 
names. For example, it has been estimated 
that the average American high school 
student might encounter any one of 
500,000 different words. 4 In the United 
States, the 1970 census listed more than 2 
million different last names, any one of 
which might occur in a typical computer 
database. Furthermore, our culture adds 
new words and new names every day, 
which means that a dictionary would al¬ 
ways be out of date. A large dictionary, by 
itself, cannot provide a solution. 

The word pronunciation module for a 
high-performance text-to-speech system 
must combine a dictionary with sophisti¬ 
cated pronunciation rules. The dictionary 
handles words with a syntactic or context- 
dependent effect, as well as words not 
regular enough to justify a rule. The rules 
then handle the rest of the words, including 
new words and names. Of the several mil¬ 
lion words and names possible, the major¬ 
ity will thus be pronounced correctly by an 
efficient system. 
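
The division of labor between dictionary and rules can be sketched as a lookup with a fallback. The Python below is a deliberately tiny illustration: it has a one-entry exception dictionary and a single spelling rule (the final “e” rule), where a real system needs thousands of rules. The phoneme symbols are ARPAbet-style codes chosen for the example, and all names are hypothetical.

```python
EXCEPTIONS = {
    "of": ["AH", "V"],   # final "f" pronounced "v"; no general rule predicts this
}

SHORT_VOWELS = {"a": "AE", "e": "EH", "i": "IH", "o": "AA", "u": "AH"}
TENSE_VOWELS = {"a": "EY", "e": "IY", "i": "AY", "o": "OW", "u": "UW"}
CONSONANT_OVERRIDES = {"c": "K"}   # crude: every other consonant maps to itself
VOWEL_LETTERS = set("aeiou")


def letter_to_sound(word):
    """Apply toy spelling rules, including the silent final 'e' rule."""
    word = word.lower()
    # Final-"e" rule: vowel + single consonant + word-final "e".
    silent_e = (len(word) >= 3 and word[-1] == "e"
                and word[-2] not in VOWEL_LETTERS and word[-3] in VOWEL_LETTERS)
    body = word[:-1] if silent_e else word
    phonemes = []
    for i, ch in enumerate(body):
        if ch in VOWEL_LETTERS:
            tense = silent_e and i == len(body) - 2   # the vowel before the last consonant
            phonemes.append(TENSE_VOWELS[ch] if tense else SHORT_VOWELS[ch])
        else:
            phonemes.append(CONSONANT_OVERRIDES.get(ch, ch.upper()))
    return phonemes


def pronounce(word):
    """Dictionary first, rules for everything else."""
    key = word.lower()
    if key in EXCEPTIONS:
        return EXCEPTIONS[key]
    return letter_to_sound(key)


for w in ("dud", "dude", "cat", "of"):
    print(w, pronounce(w))
```

Without its dictionary entry, “of” would fall through to the rules and come out with a final F, which is exactly the kind of error the exception dictionary exists to prevent.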


[Figure 3 stage labels, partially recovered: phonemes /ml'stR/ for “Mister?”, 10 bytes/second]


Text data made up of English words and numbers is the output of many kinds of 
application programs: word processors, electronic mail systems, databases, optical 
character readers, etc. This forms the initial input for BST T-T-S. We always need 
some kind of application program to create the text T-T-S reads. 


The text normalizer converts everything in the text stream to letters. For example, “1990” 
becomes “nineteen ninety” and “Dr.” becomes either “doctor” or “drive,” as appropriate. 
The current normalizer is quite large and complex. We could use a much simpler one, but 
a normalizer is always needed. Custom normalizers help optimize performance in vertical 
market applications, which often have their own standard abbreviations. 


Some words in English are not pronounced in accordance with the basic rules for English 
pronunciation. The word “of,” for example, is the only case where a final “f” is
pronounced “v.” To say these words correctly, the system stores a phonemic transcription
of their exact pronunciation in an exception dictionary. The dictionary is also needed to 
store grammatical information about particular words for use by other modules in the 
system. 


The letter-to-phoneme rules convert English spelling into phoneme transcriptions, which 
are a more exact representation of pronunciation. For example, the word “cat” is 
represented as “k æ t.” Grammatically significant endings such as “-ness,” “-ly,” and
“-ing” are also identified. 


The prosody rules create the intonation pattern, or rhythm, of sentences in the text. All 
text-to-speech systems have somewhat robotic prosodies because the computer, unlike 
human speakers, follows rigid rules. However, end users can insert prosodic “diacritics” 
into a text to get special intonation patterns. 


The “fine tuning” of pronunciation takes place in the phonetic rules. For example, the 
phoneme “t” is pronounced differently in “tom,” “atom,” and “cat.” Users can enter 
phonemes directly into the T-T-S system at this point, bypassing the text-to-phoneme 
modules, for complete control of pronunciation. 

The voice generation module turns phonemes into more than 5,000 smaller speech units, 
which are then converted to voice parameters. T-T-S allows the creation of an unlimited 
number of different voices. Three default “head models” can be extensively modified, 
using text instructions alone, to produce many different character voices. 


The interrupt driver takes frames and sends them to the output hardware at regular 
intervals of from 5 to 25 milliseconds. 


Various kinds of output hardware can be used for the final stages of T-T-S. T-T-S can 
drive the general-purpose digital signal processing chips, such as the Texas Instruments 
TMS320, found in standard voice processing systems. However, it also produces excellent 
results with inexpensive consumer electronics speech chips and can even run without any 
special speech chips. 


Figure 3. The process of converting text into speech parameters in Berkeley Speech Technologies T-T-S system. 
















































Prosodies 

The prosodic component of speech in¬ 
volves such phenomena as rhythm, intona¬ 
tion, and the emphasis or de-emphasis of 
particular words. For any given text, a 
human speaker might produce thousands 
of possible prosodic interpretations, but an 
even larger number would not be consid¬ 
ered natural under reasonable circum¬ 
stances. 

As you listen to speech, certain words 
will seem more prominent than others. 
Often, these words carry the information 
focus of the message. On the other hand, 
such word stress patterns might simply 
signal that the word is part of a multiword 
compound: “Jones Street”; “history 
teacher.” 

Speech usually divides into a number of 
shorter phrases. Some of the division 
points will be accompanied by actual si¬ 
lence, while some will be marked only by 
an apparent slowing. However, each 
phrase will contain at least one of the 
emphasized words. 

In addition to emphasized words and 
division into phrases, speech has an into¬ 
nation contour. For example, some of the 
phrases typically end either with a falling 
pitch or with a fall followed by a low rise 
— a “period” or a “comma” intonation. 

If part of a conversation, speech will 
exhibit quite elaborate prosodic phenom¬ 
ena that might carry a significant portion of 
the message’s meaning. But speech as read 
from a text, especially by a person neither 
interested in nor knowledgeable about the 
text, will provide much more regular pro¬ 
sodic information. 

Prosodies have been studied extensively 
in linguistics. However, they seem to be 
one of the most complex parts of language, 
and the information needed to generate 
them correctly from a text is not always 
available from current computational lin¬ 
guistic technology. 

For example, prosodies depend in part 
upon syntactic structure. Usually, a pro¬ 
sodic break occurs between a long subject 
and the predicate in an English sentence. 
Because no current natural language parser 
is robust enough for use with general writ¬ 
ten English texts, we cannot always find 
this break reliably. 

Prosodies generally prove the most dif¬ 
ficult part of language for a text-to-speech 
conversion system. A very good prosodic 
component would probably require that 
the computer actually “understand” the 
text. However, an adequate if rather monotone
or dull reading is possible using the
most neutral or general case of current
prosodic models. For example, textual 
punctuation marks usually indicate pro¬ 
sodic boundaries, and the last word in a 
phrase normally gets the primary empha¬ 
sis. Articles and prepositions are de- 
emphasized and associated with minor 
phrase boundaries. 
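
The neutral defaults just described are simple enough to sketch directly. The Python below is an illustration under stated assumptions: the text has already been normalized (so every period really is a boundary), the list of articles and prepositions is a short hypothetical sample, and minor phrase boundaries are not modeled at all.

```python
import re

# A small, illustrative sample of articles and prepositions.
FUNCTION_WORDS = {"a", "an", "the", "of", "in", "on", "at", "to", "for",
                  "by", "with", "from"}


def default_prosody(text):
    """Split text into phrases at punctuation and tag each word's emphasis.

    Returns a list of phrases; each phrase is a list of (word, level) pairs,
    where level is "primary", "reduced", or "normal".
    """
    phrases = []
    for chunk in re.split(r"[.,;:!?]+", text):
        words = chunk.split()
        if not words:
            continue
        tagged = []
        for i, word in enumerate(words):
            if i == len(words) - 1:
                level = "primary"      # last word in the phrase carries the emphasis
            elif word.lower() in FUNCTION_WORDS:
                level = "reduced"      # articles and prepositions are de-emphasized
            else:
                level = "normal"
            tagged.append((word, level))
        phrases.append(tagged)
    return phrases


for phrase in default_prosody("The parts are in the warehouse, on the top shelf."):
    print(phrase)
```

Rules this rigid always put the emphasis in the same place, which is one reason the resulting speech sounds flat.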

The resulting speech sounds somewhat 
boring, but most attempts to make it more 
lively result in some sentences having a 
foolish-sounding prosody. Most people 
might also, when presented with a text and 
asked to read it aloud, use something closer 
to the prosody produced by a text-to- 
speech system than they would use in 
speaking naturally. 


Phonetic rules 

After we have applied the text normalizer,
word pronunciation rules, and prosody
rules, the utterance consists of a string
of phonemes. Phonemes are the symbols
used by some dictionaries to represent 
pronunciation. Although a long way from 
actual speech sounds, they represent the 
finest distinctions that most people not 
trained in phonetics normally notice. The 
concept of “the phoneme” covers a range 
of recognizable expressions, not a point. 
The distribution pattern for various pho¬ 
neme realizations is part of what defines 
dialects. Ordinary, casual speech repre¬ 
sents phonemes in about the same way 
that messy human handwriting represents 
the abstract ideal graphic letters of the 
alphabet. 

The phonetic rule module takes pho¬ 
nemes as input to produce a detailed de¬ 
scription of the sounds of the utterance. 
The rules in this module do things such as 
assign a duration to each phoneme accord¬ 
ing to context, for example, lengthening 
the final vowel in a phrase by up to 50 
percent. Other rules remove parts of pho¬ 
nemes, for example, deleting the aspira¬ 
tion of a stop such as /k/ in a number of 
different contexts. 
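
A duration rule of the kind described here might look like the following Python sketch. The base durations are invented for illustration and do not come from any measured data. Only the 50 percent ceiling on phrase-final lengthening is taken from the text. The phoneme codes are ARPAbet-style symbols and the names are hypothetical.

```python
VOWELS = {"AA", "AE", "AH", "AY", "EH", "ER", "EY", "IH", "IY", "OW", "UW"}


def assign_durations(phonemes, base_ms, final_lengthening=0.5):
    """Give each phoneme its base duration, then stretch the phrase-final vowel.

    final_lengthening=0.5 means the last vowel is made 50 percent longer.
    """
    durations = [float(base_ms[p]) for p in phonemes]
    for i in range(len(phonemes) - 1, -1, -1):    # search backwards for the last vowel
        if phonemes[i] in VOWELS:
            durations[i] *= 1.0 + final_lengthening
            break
    return list(zip(phonemes, durations))


# Illustrative base durations in milliseconds (assumed values).
BASE_MS = {"K": 70, "AE": 120, "T": 60}
print(assign_durations(["K", "AE", "T"], BASE_MS))
```

In a full system the base value itself would depend on stress, neighboring phonemes, and speaking rate, so this single rule would be one of many applied in sequence.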

Phonetic rules play the major role of 
describing the co-articulation between 
phonemes. Each phoneme strongly influ¬ 
ences the parameters in the adjacent pho¬ 
neme. In some cases, this influence might 
extend over several adjacent phonemes. 

A great many phonetic rules must be 
programmed in a text-to-speech conver¬ 
sion system. Without these rules, the 
speech output might be somewhat recog¬ 
nizable, but it would sound very unnatural. 
In American English, for example, the
phoneme /t/ should sound quite different in
the words “Tom,” “cat,” and “butter.” 
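
The three /t/ sounds in these words can be chosen by a context rule. The Python sketch below is a simplification for illustration: it looks only at position in the word and at the neighboring phonemes, and it ignores stress, which a realistic rule would also need to consult. The phoneme codes and variant labels are hypothetical.

```python
VOWELS = {"AA", "AE", "AH", "AY", "EH", "ER", "EY", "IH", "IY", "OW", "UW"}


def realize_t(phonemes):
    """Replace each /T/ in one word with a context-dependent variant label."""
    out = []
    last = len(phonemes) - 1
    for i, p in enumerate(phonemes):
        if p != "T":
            out.append(p)
            continue
        prev_is_vowel = i > 0 and phonemes[i - 1] in VOWELS
        next_is_vowel = i < last and phonemes[i + 1] in VOWELS
        if i == 0:
            out.append("T_aspirated")    # word-initial, as in "Tom"
        elif prev_is_vowel and next_is_vowel:
            out.append("T_flap")         # between vowels, as in "butter"
        elif i == last:
            out.append("T_unreleased")   # word-final, as in "cat"
        else:
            out.append("T")
    return out


for word, phonemes in (("Tom", ["T", "AA", "M"]),
                       ("cat", ["K", "AE", "T"]),
                       ("butter", ["B", "AH", "T", "ER"])):
    print(word, realize_t(phonemes))
```

Each variant label would then select different targets from the voice tables, which is how one phoneme comes to have several distinct sounds.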

In rapid human speech, the application 
of phonetic rules is sometimes quite ex¬ 
treme. Consider, for example, the classic 
case of the phrase “Did you eat yet?” This 
phrase can reduce to something closer to 
“Chee chet?” 

People can make such reductions be¬ 
cause they understand and get feedback 
from their communication environment. 
Thus, they can judge the degree to which 
reduction is appropriate. If it is not appro¬ 
priate, they can always speak more 
formally. 

While we could certainly program such 
“fast speech” rules for a text-to-speech 
conversion system, it is probably more 
appropriate to maintain a more formal style 
of speech. For example, “want to” or 
“thank you” often reduce to “wanna” or 
“thank ya” when spoken. However, an 
electronic mail system that read its mes¬ 
sages aloud over the telephone in such an 
informal style would seem odd and out of 
place in a business context. 

Voice tables 

The phonetic rules provide a detailed 
phonetic description for an utterance. The 
voice table module converts this descrip¬ 
tion into numeric targets for use by the 
voice model. The phonetic rules in con¬ 
junction with the voice tables are the pri¬ 
mary determinants of voice intelligibility. 

Intelligibility is the likelihood that a 
human listener will be able to identify a 
particular word spoken. In one standard 
test, 5 a modern, high-quality text-to- 
speech system typically scores approxi¬ 
mately 95 percent. In contrast, low-end 
systems score from 60 to 75 percent. Care¬ 
ful human speakers typically produce 
about half the listener error rate of the best 
text-to-speech systems, although some 
communication channels carrying human 
speech elicit scores as low as 85 percent on 
this test. 

One major function of the voice tables is 
to handle differences in bandwidth. Speech 
researchers have usually assumed a mini¬ 
mum speech bandwidth of 5 or 6 kHz. 
Unfortunately, telephone system design¬ 
ers have traditionally employed a band¬ 
width of 3.5 kHz, and even the newest 
digital telephones have perpetuated that 
bandwidth. 

Important features of fricatives and 
stops occur in the frequency range of 4 to 
6 kHz. Human speech is so redundant that
generally we don't miss this frequency 
range. However, text-to-speech systems 
can easily suffer losses unless we account 
for this difference in frequency range. 

Our solution is to have different sets of 
voice tables for different frequency ranges. 
For applications involving telephones, 
some of the frequency components above 4 
kHz can be mapped to a lower frequency, 
potentially contributing to overall speech 
quality and intelligibility. 

Hardware implementation

In the work that we at Berkeley Speech 
Technologies have done on our own text- 
to-speech system, a major goal has been to 
develop a highly portable version of text- 
to-speech conversion software. We want 
to be able to deliver the same high quality 
of speech on any hardware platform that 
meets certain minimum requirements. 

In terms of computational complexity, 
our text-to-parameter conversion process 
requires from 250 to 350 kilobytes of 
memory and a processing power of about 
0.2 million instructions per second. Since 
this module is not very computationally 
demanding, we coded it entirely in the C
language. 

The voice model, on the other hand, 
requires less than 1 Kbyte of code but on 
the order of 1 MIPS. Since this module is 
small and usually time critical, we almost 
always encode it in assembly language. 

The design of our software, coupled with 
recent advances in available hardware, has 
allowed us to successfully implement text- 
to-speech capability on a wide variety of 
hardware platforms. These include 

• A pocket-sized talking dictionary with 
an 83,000-word vocabulary. It contains no 
digital signal processing chip, but synthe¬ 
sizes words in real time in an 8-MHz NEC 
V20 microprocessor. 

• A telephone voice-response board that 
simultaneously generates 16 channels of 
speech from text in real time, then converts 
the speech waveforms into 32 kilobit-per-
second ADPCM (adaptive differential pulse code
modulation) coded speech.

• A stand-alone text-to-speech conver¬ 
sion unit with a built-in speaker and a 12- 
volt power supply, designed for speaking 
text messages sent to trucks by satellite. 

• A personal speech-output device for 
blind people. Using CMOS hardware 
makes possible portability and battery- 
powered use with laptops. It speaks up to 
700 words a minute. 


Other approaches 

The principal high-end commercial text- 
to-speech systems for American English 
are Digital Equipment’s DECtalk, Cen¬ 
tigram’s Speech Plus, and our own BST 
T-T-S. Both DECtalk and Speech Plus 
are based on the work of Dennis Klatt at 
the Massachusetts Institute of Technology. 6

Broadly speaking, Klatt’s approach re¬ 
sembles the approach described in this 
article, based on an elaborate, fully syn¬ 
thetic, formant-based model of voice 
production. While the systems differ in 
many ways, the overall effect is roughly 
similar. 

At the other end of the spectrum, a 
number of “toy” quality synthesizers have 
appeared on the market for use with low- 
end personal computers. Generally, they 
have relied either on recorded waveforms 
or on one of the “speech synthesis” chips. 7 

The main problem with this approach is 
that these chips are programmed to take 
phonemes as input and to generate between
64 and 128 presynthesized segments.
However, these chips have not
implemented the large number of phonetic
rules that map phonemes into speech
sounds, so their output has been fairly unintelligible.

Another approach, under the names 
“demi-syllable,” “diphone,” or “dyad” 
synthesis, involves recording several thou¬ 
sand segments of human speech using 
some form of LPC encoding. 8 These seg¬ 
ments represent all of the theoretically 
possible transitions between adjacent pho¬ 
nemes or sequences of phonemes. Such an 
approach eliminates the need for some of 
the co-articulation rules but in return 
makes some of the other phonetic rules 
more difficult to apply. Some good re¬ 
search implementations have used this 
approach but, I believe, it does not solve 
any significant problems. Moreover, it has 
not been adopted in the highest quality 
commercial text-to-speech systems. 


Unrealistic expectations for dramatic
future improvement in text-to-speech
technology sometimes arise from an
unsophisticated view of the complex
linguistic information involved.
Improvement will continue at the same
slow, steady pace that has produced
incremental progress in accuracy,
intelligibility, and naturalness over the
past three decades. However, we can expect
faster progress in methods of delivering
the technology. ■


References 


1. G. Fant, Acoustic Theory of Speech Production,
Mouton and Co., ’s-Gravenhage, Netherlands, 1960.

2. J.L. Flanagan, Speech Analysis, Synthesis and
Perception, 2nd ed., Springer-Verlag, New York, 1972.

3. D.H. Klatt and L.C. Klatt, “Analysis, Synthesis, and
Perception of Voice Quality Variations Among Female
and Male Talkers,” J. Acoustical Society of America,
Vol. 87, Feb. 1990, pp. 820-857.

4. J.B. Carroll et al., Word Frequency Book,
American Heritage, New York, 1971.

5. J.S. Logan, B.G. Greene, and D.B. Pisoni,
“Segmental Intelligibility of Synthetic Speech
Produced by Rule,” J. Acoustical Society of America,
Vol. 86, No. 2, Aug. 1989, pp. 566-581.

6. D.H. Klatt, “Review of Text-to-Speech Conversion
for English,” J. Acoustical Society of America,
Vol. 82, No. 3, Sept. 1987, pp. 737-793.

7. I.H. Witten, Principles of Computer Speech,
Academic Press, New York, 1982.

8. J.P. Olive, “Rule Synthesis of Speech from
Dyadic Units,” Proc. 1977 IEEE Int’l Conf.
Acoustics, Speech, and Signal Processing,
Vol. ICASSP-77, pp. 568-570.



Michael H. O’Malley founded Berkeley 
Speech Technologies (initially called Berkeley 
Systems Works) in 1979. As president and chief 
scientist at BST, he has been engaged in the 
development of advanced text-to-speech sys¬ 
tems since 1980. He has engaged in speech 
research since 1961, when he worked for IBM 
Research on a text-to-speech project while still 
a student at the California Institute of Technol¬ 
ogy. He did his PhD work in the University of 
Michigan Program in Communications Sci¬ 
ences, where he studied computer science, elec¬ 
trical engineering, linguistics, and biological 
systems. 

From 1968 through 1973 O’Malley directed 
the Phonetics Laboratory at the University of 
Michigan as a member of the faculty. He was a 
principal investigator in the ARPA Speech 
Recognition Project, which he began at Michi¬ 
gan and continued at the University of Califor¬ 
nia at Berkeley as a member of the UC Com¬ 
puter Science Department. 

Readers can contact the author at Berkeley 
Speech Technologies, 2409 Telegraph Ave., 
Berkeley, CA 94705. 












FIFTH IEEE INTERNATIONAL SYMPOSIUM

ON INTELLIGENT CONTROL

• Penn Tower Hotel - Philadelphia • 

September 5-7, 1990 

Sponsored by IEEE Control Systems Society




The IEEE International Symposium on Intelligent Control is the annual meeting dedicated to
theoretical and practical problems in control systems associated with intelligence (AI, Neural, etc.).

This symposium is dedicated to 

PERCEPTION-REPRESENTATION-ACTION TRIAD

General Chairman: A. Meystel 
Program Chairmen: 

H. Kwatny, S. Navathe, H. Wechsler 
Publicity Chairman: J. Herath 
Local Arrangements Chairman: W. S. Gray 
Tutorial Chairman: B. W. Johnson 


Program Committee: 

J. Albus, National Institute of Standards and Technology
L. Acar, Univ. of Missouri-Rolla
P. Antsaklis, Univ. of Notre Dame
R. Arkin, Georgia Institute of Technology
T. Yuba, Electrotechnical Laboratory, Japan
J. Baillieul, Boston Univ.
D. Ballard, Univ. of Rochester
A. Benveniste, IRISA, France
G. Blankenship, Univ. of Maryland
A. Borgida, Rutgers Univ.
A. Buchmann, GTE Labs
A. Cornelio, Univ. of Florida
U. Chakravarthy, Univ. of Florida
K. Doty, Univ. of Florida
T. Fukuda, Nagoya Univ., Japan
J. Gelfand, David Sarnoff Research Center
E. Grant, The Turing Institute, Glasgow, UK
M. Grimble, Univ. of Strathclyde, UK
A. Goldenberg, Univ. of Toronto, Canada
A. Guez, Drexel Univ.
M. Gupta, Univ. of Saskatchewan, Canada
V. Hayward, McGill Univ., Canada
M. Herman, National Institute of Standards and Technology
C. R. Johnson, Cornell Univ.
M. Kam, Drexel Univ.
M. O. Kaynak, Bogazici Univ., Turkey
L. Kerschberg, George Mason Univ.
M. Ketabchi, Santa Clara Univ.
E. Krotkov, Carnegie-Mellon Univ.
D. Liftman, George Mason Univ.
A. Jayasumana, Colorado State Univ.
N. Mattos, Univ. of Kaiserslautern, West Germany
Y. Ogiwara, Univ. of Electrocommunications, Japan
P. Meer, Univ. of Maryland
S. Y. Nof, Purdue Univ.
G. Ozsoyoglu, Case Western Reserve Univ.
J. Principe, Univ. of Florida
L. Raschid, Univ. of Maryland
K. W. Ross, Univ. of Pennsylvania
F. M. A. Salam, Michigan State Univ.
H. Stephanou, George Mason Univ.
C. De Silva, Univ. of British Columbia, Canada
D. Tesar, Univ. of Texas at Austin
J. Tou, Univ. of Florida
S. Tzafestas, National Technical Univ. of Athens, Greece
C. Weisbin, JPL
K-K.D. Young, Lawrence Livermore Lab.
N. Saito, Keio Univ., Japan


Suggested Topics of papers 

are not limited to the following list: 

• Intractable control problems in the perception-representation- 
action loop 

• Control with perception driven representation 

• Multiple modalities of perception, and their use for control 

• Control of movements required by perception 

• Control of systems with complicated dynamics 

• Intelligent control for interpretation in Biology and Psychology 

• Actively building-up representation systems 

• Identification and estimation of complex events in unstructured 
environment 

• Explanatory procedures for constructing representations 

• Perception for control of goals, subgoals, tasks, assignments 

• Mobility and manipulation 

• Reconfigurable systems 

• Intelligent control of power systems 

• Intelligent control in automated manufacturing 

• Perception driven actuation 

• Representations for intelligent controllers (geometry, physics, 
processes) 

• Robust estimation in intelligent control 

• Decision making under uncertainty 

• Discrete event systems 

• Computer-aided design of intelligent controllers 

• Dealing with unstructured environment (perception/motion) 

• Learning and adaptive control systems 

• Autonomous systems 

• Intelligent material processing: perception based reasoning 

• Intelligence computing systems 











SYMPOSIUM WILL CONSIST OF 
THREE SPECIALIZED MINI-CONFERENCES: 


Perception as a source of knowledge for control (Chair - H. Wechsler) 
Knowledge as a core of perception-control activities (Chair - S. Navathe) 
Decision and control via perception and knowledge (Chair - H. Kwatny) 

Intersected by 

SEVERAL PLENARY PANEL SESSIONS: 

I. Perception in the loop 

II. Action in the loop 

III. Knowledge representation in the loop and others 

Invited sessions on: 


• Autonomous systems 
• Intelligent distributed control 
• Heterogeneous knowledge organization
• Neural Controllers 
• Intelligence Computing 

Tutorials on Autonomous Mobile Robots, Neural Networks and 
Advanced High Performance Computing Systems: 


For additional information, please contact Dr. A. Meystel, (215) 895-2220; Dr. J. Herath, (215) 895-6758; or Dr. S. Gray, (215) 895-6762. 
ECE Dept., Drexel University, Philadelphia, PA 19104. 


REGISTRATION FEES 

                 On/before Aug 5, 1990    After Aug 5, 1990
□ Student        $50                      $70
□ IEEE Member    $200                     $220
□ Other          $230                     $275

CANCELLATION FEE $20 Before August 27 

Note — fee includes (except for students) lunch, coffee & pastries, social events and a 
copy of the proceedings. Student fee covers attendance of sessions and a copy of the 
proceedings only. Payment - In US dollars only: by check; payable to IC-90. Send check 
and registration form to: 

Intelligent Control-1990, Department ECE, Drexel University, Philadelphia, PA 19104 

Name (First) _ (Last) _ 

Affiliation _ 

Student ID # (if applicable) _ 


City _ State _ Zip Code _ Country _. 


THE PENN TOWER HOTEL RESERVATION

Name (First) _ (Last) _

Address _ Telephone ( _ ) _ 

City _ State _ ZipCode _ Country _ 

Arrival Date _ Departure Date _ 

Single ($80.00) _ Double ($90.00) _ 

Credit Card (AX, CB, DC, Visa, MC) _ Exp Date _

To guarantee a reservation, please forward a one night's deposit or include your valid credit 
card number with expiration date. Any reservations without a guarantee will only be held until 
6:00 p.m. on date of arrival. Please return to: The Penn-Tower Hotel; Civic Center Boulevard 
at 34th St., Philadelphia, PA 19104; 215/387-8333 


























An Introduction to Speech 
and Speaker Recognition 

Richard D. Peacocke and Daryl H. Graf 
Bell-Northern Research 



Speech recognition, the ability to identify spoken words,
and speaker recognition, the ability to identify who is
saying them, are becoming commonplace applications
of speech processing technology.


Being able to speak to your personal computer, and have it
recognize and understand what you say, would provide a
comfortable and natural form of communication. It would reduce
the amount of typing you have to do, leave your hands free, and
allow you to move away from the terminal or screen. You would
not even have to be in the line of sight of the terminal. It
would also help in some cases if the computer could tell who
was speaking.

If you want to use voice as a new medium on a computer
workstation, it is natural to explore how speech recognition can
contribute to such an environment. Here, we will review the
state of speech and speaker recognition, focusing on current
technology applied to personal workstations.

Limited forms of speech recognition are available on personal
workstations. Currently there is much interest in speech
recognition, and performance is improving. Speech recognition
has already proven useful for certain applications, such as
telephone voice-response systems for selecting services or
information, digit recognition for cellular phones, and data
entry while walking around a railway yard or clambering over a
jet engine during an inspection.

Nonetheless, comfortable and natural communication in a general
setting (no constraints on what you can say and how you say it)
is beyond us for now, posing a problem too difficult to solve.
Fortunately, we can simplify the problem to allow the creation
of applications like the examples just mentioned. Some of these
simplifying constraints are discussed in the next section.

Speaker recognition is related to work on speech recognition.
Instead of determining what was said, you determine who said
it. Deciding whether or not a particular speaker produced the
utterance is called verification, and choosing a person’s
identity from a set of known speakers is called identification.
The most general form of speaker recognition (text-independent)
is still not very accurate for large speaker populations, but if
you constrain the words spoken by the user (text-dependent) and
do not allow the speech quality to vary too wildly, then it too
can be done on a workstation.

See the sidebar “Applications” for a description of typical
speech and speaker recognition applications.

Factors affecting speech recognition

Modern speech recognition research began in the late 1950s with
the advent of the digital computer. Combined with tools to
capture and analyze speech, such as analog-to-digital converters
and sound spectrograms, the computer let researchers look for
ways to extract features from speech that allow discrimination
between different words. The 1960s saw advances in the automatic
segmentation of speech into units of linguistic relevance (such
as phonemes, syllables, and words) and in new pattern-matching and

0018-9162/90/0800-0026$01.00 © 1990 IEEE














Applications 

Although the performance of speech and speaker recogni¬ 
tion systems is far from perfect, these systems have already 
proven their usefulness for certain applications. 

Speech recognition. Currently, speech recognition is most 
often applied in manufacturing for companies needing voice 
entry of data or commands while the operator's hands are 
otherwise occupied. Related applications occur in product in¬ 
spection, inventory control, command/control, and material 
handling. Speech recognition also finds frequent application 
in medicine, where voice input can significantly accelerate 
the writing of routine reports. 

Speech recognition over the telephone network, although 
less used, has the greatest potential for growth. Automating 
the telephone operator’s job can greatly reduce operating 
costs for telephone companies. Furthermore, speech recog¬ 
nition can help users control the personal workstation or in¬ 
teract with other applications remotely when touch-tone key¬ 
pads are not available. (Telephone network applications are 
described in articles by Matthew Lennig and Ryohei Nakatsu 
elsewhere in this issue.) 

Finally, speech recognition offers greater freedom to the 
physically handicapped. 

Typical real-world applications: 

• Delco Electronics employs IBM PC/AT-Cherry Electronics
and Intel RMX86 recognition systems to collect circuit
board inspection data while the operator repairs and marks
the boards.

• Southern Pacific Railway inspectors now routinely use a 
PC-based Votan recognition system to enter car inspection 
information from the field by walkie-talkie. 

• Michigan Bell has installed a Northern Telecom recogni¬ 
tion system to automate collect and third-number billed calls. 
AT&T has also put in field trial systems to automate call- 
type selection in its Reno, Nevada, and Hayward, California, 
offices. 

Speaker recognition. Speaker recognition has been ap¬ 
plied most often as a security device to control access to 
buildings or information. One of the best known examples is 
the Texas Instruments corporate computer center security 
system. Security Pacific has employed speaker verification 
as a security mechanism on telephone-initiated transfers of 
large sums of money. In addition to adding security, verifica¬ 
tion is advantageous because it reduces the turnaround time 
on these banking transactions. Bellcore uses speaker verifi¬ 
cation to limit remote access of training information to au¬ 
thorized field personnel. Speaker recognition also provides a 
mechanism to limit the remote access of a personal worksta¬ 
tion to its owner or a set of registered users. 

In addition to its use as a security device, speaker recogni¬ 
tion could be used to trigger specialized services based on a 
user’s identity. For example, you could configure an answer¬ 
ing machine to deliver personalized messages to a small set 
of frequent callers. 


classification algorithms. By the 1970s, a 
number of important techniques essential 
to today’s state-of-the-art speech recogni¬ 
tion systems had emerged, spurred on in 
part by the Defense Advanced Research 
Projects Agency speech recognition proj¬ 
ect. These techniques have now been re¬ 
fined to the point where very high recogni¬ 
tion rates are possible, and commercial 
systems are available at reasonable prices. 

Five factors can be used to control and 
simplify the speech recognition task 1 : 

(1) Isolated words. Speech consisting 
of isolated words (short silences between 
the words) is much easier to recognize than 
continuous speech because word bounda¬ 
ries are difficult to find in continuous 
speech. Also, coarticulation effects in 
continuous speech cause the pronunciation 
of a word to change depending on its posi¬ 
tion relative to other words in a sentence. 
For example, “did you?” is not the same as 
“did” + short silence + “you?” Other effects
depend on the rate of speaking as well, such as
our tendency to drop the “t” in “want” when
saying “want to” casually and quickly.

Error rates can definitely be reduced by 
requiring the user to pause between each 
word. For example, in a study by Bahl et 
al., 2 error rates of 9 percent for continuous 
recognition decreased to 3 percent for iso¬ 
lated-word recognition. However, this 
type of restriction places a burden on the 
user and reduces the speed with which 
information can be input to the system 
(from a range of about 150-250 words per 
minute down to about 20-100 words per 
minute). 

(2) Single speaker. Speech from a 
single speaker is also easier to recognize 
than speech from a variety of speakers 
because most parametric representations 
of speech are sensitive to the characteris¬ 
tics of the particular speaker. This makes a 
set of pattern-matching templates for one 
speaker perform poorly for another 
speaker. Therefore, many systems are 
speaker dependent — trained for use with 
each different operator. Relatively few 
speech recognition systems can be used by 


the general public. A rule of thumb used by 
many researchers is that, for the same task, 
speaker-dependent systems will have error 
rates roughly three to five times smaller 
than speaker-independent ones. 

One way to make a system speaker inde¬ 
pendent is simply to mix training templates 
from a wide variety of speakers. A more 
sophisticated approach will attempt to look 
for phonetic features that are relatively 
invariant between speakers. 

(3) Vocabulary size. The size of the 
vocabulary of words to be recognized also 
strongly influences recognition accuracy. 
Large vocabularies are more likely to 
contain ambiguous words than small vo¬ 
cabularies. Ambiguous words are those 
whose pattern-matching templates appear 
similar to the classification algorithm used 
by the recognizer. They are therefore 
harder to distinguish from each other. Of 
course, small vocabularies composed of 
many ambiguous words can be particularly 
difficult to recognize. A famous example 
is the E-set, which consists of a subset of 
the English alphabet and digits: “B,” “C,” 










Figure 1. Components of a typical speech recognition system. 


“D,” “E,” “G,” “P,” “T,” “V,” “Z,” and
“three.”

The amount of time it takes to search the 
speech model database also relates to vo¬ 
cabulary size. Systems containing many 
pattern templates typically require pruning 
techniques to cut down the computational 
load of the pattern-matching algorithm. By 
ignoring potentially useful search paths, 
pruning heuristics can also introduce rec¬ 
ognition errors. 

(4) Grammar. The grammar of the rec¬ 
ognition domain defines the allowable 
sequences of words. A tightly constrained 
grammar is one in which the number of 
words that can legally follow any given 
word is small. The amount of constraint on 
word choice is referred to as the perplexity 
of the grammar. Systems with low perplex¬ 
ity are potentially more accurate than those 
that give the user more freedom because 
the system can limit the effective vocabu¬ 
lary (and search space) to those words that 
can occur in the current input context. For 
example, a system described in Kimball et
al. 3 had an error rate of 1.6 percent with 
perplexity 19 (tightly constrained), while 
the error rate hit about 4.5 percent with 
perplexity 58 (more loosely constrained). 

(5) Environment. Background noise, 
changes in microphone characteristics, 
and loudness can all dramatically affect 
recognition accuracy. Many recognition 
systems are capable of very low error rates 
as long as the environmental conditions 
remain quiet and controlled. However, 
performance degrades when noise is intro¬ 
duced or when conditions differ from the 
training session used to build the reference 
templates. To compensate, the user must 
almost always wear a head-mounted, 
noise-limiting microphone with the same 
response characteristics as the microphone 
used during training. 
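
The grammar constraint in factor (4) can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: the toy command grammar and the function names are hypothetical, every allowed successor is assumed equally likely, and every word context is weighted equally. A real system would estimate perplexity from usage statistics.

```python
import math

def average_branching(grammar):
    """Arithmetic mean of the number of words allowed after each word."""
    return sum(len(nxt) for nxt in grammar.values()) / len(grammar)

def uniform_perplexity(grammar):
    """Geometric mean of the branching factors.

    Assumes every allowed successor is equally likely and every
    context is weighted equally (a simplification).
    """
    log_sum = sum(math.log2(len(nxt)) for nxt in grammar.values())
    return 2 ** (log_sum / len(grammar))

# Hypothetical toy command grammar: word -> words that may follow it.
toy_grammar = {
    "<start>": {"show", "list"},
    "show": {"ships", "ports"},
    "list": {"ships", "ports"},
    "ships": {"<end>", "now"},
    "ports": {"<end>", "now"},
    "now": {"<end>", "please"},
}

print(average_branching(toy_grammar))
print(uniform_perplexity(toy_grammar))
```

Under the same assumptions, a vocabulary with no grammar at all (any word may follow any word) yields a perplexity equal to the vocabulary size, which is why loosening the grammar enlarges the effective search space.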


Components of a 
speech recognition 
system 

Most computer systems for speech rec¬ 
ognition include the following five com¬ 
ponents (see Figure 1): 

(1) A speech capture device. This usu¬ 
ally consists of a microphone and associ¬ 
ated analog-to-digital converter, which 
digitally encodes the raw speech wave¬ 
form. 

(2) A digital signal processing module. 
The DSP module performs endpoint (word 
boundary) detection to separate speech 
from nonspeech, converts the raw wave¬ 
form into a frequency domain representa¬ 
tion, and performs further windowing, 
scaling, filtering, and data compression. 4 
The goal is to enhance and retain only 
those components of the spectral represen¬ 
tation that are useful for recognition pur¬ 
poses, thereby reducing the amount of 
information that the pattern-matching al¬ 
gorithm must contend with. A set of these 
speech parameters for one interval of time 
(usually 10-30 milliseconds) is called a 
speech frame. 

(3) Preprocessed signal storage. Here, 
the preprocessed speech is buffered for the 
recognition algorithm. 

(4) Reference speech patterns. Stored 
reference patterns can be matched against 
the user’s speech sample once it has been 
preprocessed by the DSP module. This 
information is stored as a set of speech 
templates or as generative speech models. 

(5) A pattern matching algorithm. The 
algorithm must compute a measure of 
goodness-of-fit between the preprocessed 
signal from the user’s speech and all the 
stored templates or speech models. A se¬ 


lection process chooses the template or 
model (possibly more than one) with the 
best match. 
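
As a rough illustration of the framing and endpoint detection done in component (2), the following sketch cuts a waveform into 20-millisecond frames and marks where speech begins and ends by frame energy. It is a simplified assumption, not the method of any system described here: the function names, the synthetic test signal, and the fixed energy threshold are hypothetical, and the conversion to a frequency-domain representation is omitted.

```python
import math

def frame_energies(samples, sample_rate, frame_ms=20):
    """Split samples into non-overlapping frames; return each frame's mean energy."""
    frame_len = int(sample_rate * frame_ms / 1000)
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def find_endpoints(energies, threshold):
    """Return (first, last) indices of frames above threshold, or None if silent."""
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0], active[-1]

# Synthetic signal: 100 ms of silence, 200 ms of a 440-Hz tone, 100 ms of silence.
rate = 8000
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
signal = silence + tone + silence

energies = frame_energies(signal, rate, frame_ms=20)
print(find_endpoints(energies, threshold=0.01))  # frames 5 through 14 hold the tone
```

In practice the threshold would have to adapt to background noise, which is one reason the environment (factor 5 above) matters so much.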

Two major types of pattern matching in 
use are template matching by dynamic time 
warping and hidden Markov models. Arti¬ 
ficial neural networks applied to speech 
recognition have also had some success, 
but this work is still in the early stages of re¬ 
search. 5 Moreover, linguistic knowledge 
incorporated into the pattern-recognition 
algorithm can enhance performance. How¬ 
ever, such sophisticated techniques lie out¬ 
side of the scope of this article (see, for 
example, O’Shaughnessy 4 and Mariani 6 ). 

Template matching by dynamic time 
warping became very popular in the 1970s. 
Template matching is conceptually simple. 
You want to compare the preprocessed 
speech waveform directly against a refer¬ 
ence template by summing the distances 
between respective speech frames. How¬ 
ever, biological limitations tend to produce 
nonlinear variations in timing from utter¬ 
ance to utterance. Consequently, the vari¬ 
ous frames of a word may be out of align¬ 
ment with the corresponding frames of the 
given template. Since the order of speech 
events is fairly constant, you correct the 
misalignment by stretching the template in 
some places and compressing it in others to 
find an optimum match. Dynamic program¬ 
ming helps compute the optimum match. 
The sidebar “Dynamic time warping” illus¬ 
trates the resulting time warp process. 

Hidden Markov models are used in most 
current research systems because this tech¬ 
nique produces better results for continu¬ 
ous speech with moderate-size vocabular¬ 
ies. HMMs are stochastic state machines 
that associate probabilities of producing 
sounds with transitions from state to state. 
An ideal HMM models speech with the 
same variations that occur in human speech 
due to coarticulation and other effects. 
Speech generated by a human being is 
matched against an HMM by computing 
the probability that the HMM would have 
generated the same utterance or by finding 
the state sequence through the HMM that 
has the highest probability of producing the 
utterance. The fact that HMMs generate 
poor-quality speech explains why recogni¬ 
tion based on HMMs is still not perfect. 

The sidebar “Hidden Markov models” 
further details the use of HMMs. Markov 
chains, although known about for almost a 
century, have only been successfully used 
in the context of speech recognition for the 
past 15 years or so. Until recently, no 
method existed for optimizing the model 

































Dynamic time warping 


Frame distances between the pro¬ 
cessed speech frames and those of 
the reference templates are summed 
to provide an overall distance measure 
of similarity. But, instead of taking 
frames that correspond exactly in 
time, you would do a time “warp” on
the utterance (and scale its length) so 
that similar frames in the utterance 
line up better against the reference 
frames. A dynamic programming pro¬ 
cedure finds a warp that minimizes 
the sum of frame distances in the tem¬ 
plate comparison. The distance pro¬ 


duced by this warp is chosen as the 
similarity measure. 

In the illustration here, the speech 
frames that make up the test and ref¬ 
erence templates are shown as scalar 
amplitude values plotted on a graph 
with time as the x axis. In practice, 
they are multidimensional vectors, and 
the distance between them is usually 
taken as the Euclidean distance. The 
graphs show how warping one of the 
templates improves the match be¬ 
tween them. (For further information, 
see chapter 10 of O'Shaughnessy. 4 ) 
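
The procedure described in this sidebar can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of any system discussed here: the function names, the toy one-dimensional frames, and the template labels are hypothetical, and the simple three-way step rule (with no slope constraints or length normalization) is an assumption.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two speech frames (equal-length vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(test, reference):
    """Minimum summed frame distance over all monotonic alignments."""
    n, m = len(test), len(reference)
    inf = float("inf")
    # cost[i][j]: best total distance aligning test[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(test[i - 1], reference[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # reference frame repeated
                                 cost[i][j - 1],      # test frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def recognize(test, templates):
    """Return the label of the template with the smallest warped distance."""
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))

# Toy frames: one-dimensional vectors standing in for real feature vectors.
templates = {
    "rise-fall": [(0,), (1,), (2,), (1,), (0,)],
    "flat": [(1,), (1,), (1,), (1,), (1,)],
}
test = [(0,), (0,), (1,), (2,), (2,), (1,), (0,)]  # slower "rise-fall"

print(dtw_distance(test, templates["rise-fall"]))  # 0.0 after warping
print(recognize(test, templates))                  # rise-fall
```

The test utterance has seven frames and the reference five, so a frame-by-frame comparison is not even defined; the warp repeats reference frames where the speaker lingered.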


[Figure: amplitude plotted against time for the reference template and the test template, before time warp.]




parameters to generate observed speech 
patterns. (The US Department of Defense 
actually suppressed publication of the ad¬ 
vances in HMM algorithms for a while in 
the mid-1970s, probably because of their 
use in cryptanalysis.) As well as represent¬ 
ing low-level speech segments and transi¬ 
tions, hidden Markov models provide a 
framework on which you can model higher 
level structures in continuous speech sig¬ 
nals and incorporate other knowledge 
about the communication. 


Current speech 
recognition systems 

Current speech recognition systems can 
be categorized according to the types of 
constraint they place on the speech. At one 
end of the spectrum fall speaker-independ¬ 
ent, continuous, unconstrained-grammar, 
large-vocabulary systems. These systems 
are still very much in the research stage. 

Several systems among those represent¬ 


ing the state of the art were trained and 
tested on the same speech data — the 
DARPA resource management database 
— and are easily compared. The DARPA 
resource management task involves que¬ 
ries and commands to a database of war¬ 
ships. The associated database consists of 
a 997-word vocabulary and grammars with 
various complexities. Sphinx, a recognizer 
developed at Carnegie Mellon University, 
has a maximum word-recognition accu¬ 
racy of 93.7 percent for a grammar of 
perplexity 60 and 70.6 percent for a grammar
of perplexity 997. 1 BBN’s Byblos 7
and a system developed at Lincoln Labs 8 
have word accuracies of 88.7 percent and 
87.4 percent, respectively, for the perplex¬ 
ity 60 grammar (BBN’s system requires 
about two minutes of speech to adapt to a 
particular speaker before reaching this 
level of performance). Texas Instruments* 
and Stanford Research Institute 9 have re¬ 
ported systems with 44.3 percent and 40.4 
percent accuracy on the perplexity 997 
grammar. These systems have considera¬ 
bly lower sentence accuracies. 
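
A back-of-the-envelope calculation suggests why sentence accuracy falls so far below word accuracy. The sketch assumes that word errors are independent and that a sentence counts as correct only if every word is correct; both are simplifications, and the eight-word sentence length is a hypothetical value, not a figure from the studies cited.

```python
def sentence_accuracy(word_accuracy, words_per_sentence):
    """Chance that every word is right, assuming independent word errors."""
    return word_accuracy ** words_per_sentence

# Word accuracies quoted above; sentence length of 8 is an assumption.
for acc in (0.937, 0.887, 0.706):
    print(acc, round(sentence_accuracy(acc, 8), 3))
```

Even a small per-word error rate compounds quickly as sentences get longer.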

Representative of the state of the art in 
speaker-dependent, isolated-word, large- 
vocabulary recognizers are systems like 
IBM’s Tangora recognizer, which is capable
of 97 percent accuracy for a 20,000-word
vocabulary, 10 and NEC’s 97.5 percent
accurate, 1,800-word system. 11

A variety of other systems trade off 
constraints on the input speech for higher 
recognition accuracies. Among these are 
the AT&T Bell Labs telephone-grade, 
speaker-independent, connected-digit re¬ 
cognizer (98.5 percent accurate when the 
number of digits is known 12 ) and a speaker- 
dependent version of BBN’s Byblos, 
which measured 94.8 percent accurate on 
the perplexity 60 DARPA resource man¬ 
agement task. 

At the highly constrained speech end of 
the spectrum fall speaker-dependent, 
single-word, small-vocabulary recognition
systems. A variety of such systems
can achieve accuracies above
99 percent.

Various commercial systems have ap¬ 
peared for Sun workstations and IBM- 
compatible PCs over the past few years. 
Table 1 summarizes the capabilities, costs, 
and manufacturers’ claimed accuracies of 
a sample of these commercial products. 
Although several companies advertise 
speaker-independent, continuous, large- 


* See Kai-Fu Lee, 1 p. 133.


















Hidden Markov models 

A hidden Markov model (HMM) is a doubly stochastic process for
producing a sequence of observed symbols.

An underlying stochastic finite state machine (FSM) drives a set
of stochastic processes, which produce the symbols. When a state
is entered after a state transition in the FSM, a symbol from
that state’s set of symbols is selected probabilistically for
output. The term “hidden” is appropriate because the actual
state of the FSM cannot be observed directly, only through the
symbols emitted. In the example here, the sequence of symbols
AAaaB could have been produced by any of three different state
transition sequences.

State    Possible outputs
1        A, a
2        a
3        B

AAaaB could be produced by the following state sequences:

   → 1 → 1 → 1 → 1 → 3
or → 1 → 1 → 1 → 2 → 3
or → 1 → 1 → 2 → 2 → 3

Although not shown in the example, probabilities are attached to
the finite state transitions, and discrete probability
distributions control the symbol output for each state
(continuous density HMMs also exist). In the case of isolated
word recognition, each word in the vocabulary has a
corresponding HMM. These HMMs might actually consist of HMMs
that model subword units such as phonemes connected to form a
single word-model HMM. In the case of continuous word
recognition, a single HMM corresponds to the domain grammar.
This grammar model is constructed from word-model HMMs. The
observable symbols correspond to (quantized) speech frame
measurements.

An algorithm known as the forward/backward (or Baum-Welch)
algorithm finds a set of state transition probabilities and
symbol output distributions for each HMM. This iterative
hill-climbing algorithm uses training data to refine an initial
(possibly random) set of model parameters such that the HMM is
more likely to generate patterns from the training set.

After this initial training stage, a word or sentence to be
recognized is spoken, and speech measurements are made that
reduce the utterance to a sequence of symbols. In the case of
isolated word recognition, the forward algorithm computes the
probability that each word model produced the observed sequence
of symbols; the model with the highest probability represents
the recognized word. In the case of continuous recognition, the
Viterbi algorithm finds the state transition path, through the
grammar model, with the maximum likelihood of generating the set
of measurements. The sequence of word models on this path
corresponds to the recognized sentence. (For further information
see “Introduction to Hidden Markov Models” by L.R. Rabiner and
B.H. Juang, published in IEEE Trans. Acoustics, Speech, and
Signal Processing, Jan. 1986, pp. 4-16.)


vocabulary speech recognition, they care¬ 
fully avoid making strong claims about the 
accuracy of their products. With commer¬ 
cial systems, you typically get what you 
pay for. Products available for less than 
$1,000 US are isolated-word, small-vo¬ 
cabulary recognizers. Speaker-dependent, 
isolated-word, large-vocabulary recogniz¬ 
ers for automated dictation are available 
for a few thousand dollars. You’ll see an 
order of magnitude leap in price when you 
move to large-vocabulary, speaker-inde¬ 
pendent, continuous-speech recognizers. 

Speaker recognition: the voice, not just the words

Speaker recognition is related to speech 
recognition. When the task involves iden¬ 
tifying the person talking rather than what 
is said, the speech signal must be processed 
to extract measures of speaker variability 
instead of being analyzed by segments 
corresponding to phonemes or pieces of 
text one after the other. For speaker recog¬ 
nition, only one classification is made, 
based on part or all of an input test utter¬ 
ance. Although various studies have shown 
that certain acoustical features work better 
than others in predicting speaker identity, 
few recognizers examine specific sounds 
because of difficulties in phone segmenta¬ 
tion and identification. 

Both automatic speaker verification and 
speaker identification use a stored data¬ 
base of reference patterns (templates) for 
N known speakers. Both involve similar 
analysis and decision techniques. Verifi¬ 
cation is simpler because it only requires 
comparing the test pattern against one ref¬ 
erence pattern and it involves a binary 
decision: Is there a good enough match 
against the template of the claimed 
speaker? The error rate for speaker identi¬ 
fication can be much greater because it 
requires choosing which of the N voices 
known to the system best matches the test 
voice or “no match” if the test voice differs 
sufficiently from all the reference tem¬ 
plates. 
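
The difference between the two decisions can be made concrete with a small sketch. It assumes, purely for illustration, that each template and test pattern is a short feature vector and that a Euclidean distance with a fixed threshold decides a match; the function names and the threshold are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(test, templates, claimed_speaker, threshold):
    """Verification: one comparison and a binary decision."""
    return distance(test, templates[claimed_speaker]) <= threshold


def identify(test, templates, threshold):
    """Identification: compare against all N templates; None means "no match"."""
    best = min(templates, key=lambda name: distance(test, templates[name]))
    return best if distance(test, templates[best]) <= threshold else None
```

Verification costs one comparison regardless of N, while identification makes N comparisons and has N + 1 possible outcomes, which is one reason its error rate can be much greater.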

Comparing test and reference utterances for speaker identity is
much simpler for identical underlying texts, as in text-dependent
speaker recognition. With cooperative speakers you can apply
speaker recognition straightforwardly by using the same words to
train the system and then test it. This usually happens in
verification, but speaker identification often requires
text-independent methods. Higher error rates for text-independent
methods mean you will need much more speech data both for
training and testing.

Table 1. A sample of commercially available speech recognition systems (for IBM PCs unless otherwise indicated).

Recognizer                             | Speaker                  | Speech            | Vocabulary          | Price (US $)    | Percent word accuracy (per the manufacturer)
---------------------------------------|--------------------------|-------------------|---------------------|-----------------|---------------------------------------------
Dragon Voice-Scribe 400                | Dependent                | Isolated-word     | 400 words           | $995            | >95
Dragon Dictate                         | Adaptive                 | Isolated-word     | 30,000 words        | $9,000          | >90
ITT VRS 1280/PC                        | Dependent                | Continuous-speech | 2,000 words         | $9,000          | >98
Phonetic Engine* (Speech Systems Inc.) | Independent              | Continuous-speech | 10,000-40,000 words | $10,500-$47,100 | 95
Telerec (Voice Control Systems)**      | Independent              | Connected-word    | 50 words            | $3,000          | >98
Verbex series 5000, 6000, 7000         | Dependent                | Continuous-speech | 80-10,000 words     | $5,600-$9,600   | >99.5
Voice Card (Votan)                     | Dependent or independent | Continuous-speech | 300 words           | $3,500          | >99 (speaker dependent), 95 (speaker independent)
Voice Comm Unit (Fujitsu)              | Dependent                | Connected-word    | 4,000 words         | Only in Japan   | 99.9
Voice Master Key (Covox)               | Dependent                | Isolated-word     | 64 words            | $150            | 95-96
Voice Navigator† (Articulate Systems)  | Dependent                | Isolated-word     | 1,000 words         | $1,300          | 95
Voice Pro (Voice Processing Corp.)     | Independent              | Continuous-speech | 13 words            | $5,000          | 97-99
Voice Report (Kurzweil AI)             | Dependent                | Isolated-word     | 20,000 words        | $18,900         | 98

* Available for Sun workstations
** VCS technology is used in Dialogic products
† Available for Macintosh (based on Dragon Systems technology)


Automatic speaker recognition by computer has been an active
research area since the early 1960s. A 1962 paper introduced the
spectrogram as a means of personal identification, and this
stimulated a good deal of further research. The term “voiceprint”
also appeared in that paper. Unfortunately, the analogy with
fingerprint reading is incorrect. As pointed out by Doddington
[13], the spectrogram is a function of the speech signal, not of
the physical anatomy of the speaker. The speech signal depends
far more on the speaker’s actions, themselves a complex function
of many factors, than on the shape of the speaker’s vocal tract.
The term “voiceprint” is misleading.

Speaker recognition systems

Speaker recognition by computer has 
only had limited success to date in applica¬ 
tions using free text (text independent). 
Nonetheless, text-independent recognition
of speakers has become an increasingly
popular area of research, particularly
for applications such as forensics, intelligence
gathering, and passive surveillance
of voice circuits. Free-text recognition
usually lacks control over conditions that 
influence system performance, including 
variability in the speech signal and distor¬ 
tions and noise in the communications 
channel. The recognition task faces mul¬ 
tiple problems: unconstrained input 
speech, uncooperative speakers, and un¬ 
controlled environmental parameters. This 
has made it necessary to focus on features 
and characteristics of speech unique to the 
individual. 

Performance of text-independent sys¬ 
tems has lagged behind that of text-de¬ 
pendent systems, as you might expect. 
However, Markel and Davis [14] achieved
excellent results with a linguistically un¬ 
constrained database of unrehearsed 
speech. Using voice pitch and linear pre¬ 
dictive coding (LPC) reflection coeffi¬ 
cients in their model, they reached 2 per¬ 
cent identification error and 4 percent 
verification error rates for 40-second seg¬ 
ments of input speech. Results were not 
nearly as good with shorter input speech 
segments, even though the system avoided 
operational problems of microphone deg¬ 
radation, acoustic noise, and channel dis¬ 
tortion. In text-independent recognition of 
nine male speakers over a radio channel 
at Bolt Beranek and Newman, the best 
performance was a 30 percent error rate 
for input speech segments of about two 
seconds [15].

Text-independent recognition seems 
mainly slated for unobtrusive surveillance 
of individuals. As mentioned earlier, text- 
independent speaker identification poses a 
difficult problem. The accuracy of state-of-the-art
text-independent identification is low, and it requires
continuous use of computing power. Text-dependent
speaker verification has the greatest poten¬ 
tial for practical application at the moment. 
A number of organizations have research 
and development programs in speaker 
verification, and Texas Instruments and 
AT&T Bell Labs have both made major 
efforts in this research area. 

AT&T Bell Labs has concentrated on 
speaker recognition over telephone lines, 
which faces difficult problems of micro¬ 
phone and channel distortion. Speaker 
recognition over telephone lines opens up 
an enormous set of possible uses, such as 
identification for various kinds of transac¬ 
tion processing in banking, shopping, and 
database access. 

AT&T Bell Labs started its automatic 
speaker verification system in 1970. Re¬ 
searchers there chose measurements that 
are largely insensitive to the phase and 
spectral amplitude distortions likely over 
telephone lines. In an early five-month 
operational simulation, the system showed 
a user rejection rate and impostor accep¬ 
tance rate of about 10 percent initially for 
new users, dropping to about 5 percent for 
experienced users and fully adapted tem¬ 
plates. A more recent system used over 
telephone lines has achieved error rates 
(rejection of true speakers and acceptance 
of impostors) of approximately 2 percent [16].

Texas Instruments has applied speaker 
verification to control access to its corporate
computer center [13]. The gross rejection
rate of the operational system measured 
0.9 percent, with a casual impostor accep¬ 
tance rate of 0.7 percent. The system has 
been operational 24 hours a day for more 
than a decade. The verification step uses a 
comparison of dynamic features, and time 
alignment is established using a simplified 
form of dynamic time warping. Verifica¬ 
tion utterances are constructed randomly 
using a four-word fixed phrase structure, 
for example, “Proud Ben served hard.” As 
well as indicating what the user should say, 
the voice prompt helps stabilize pronun¬ 
ciation because the user tends to say it in 
the same way as prompted. Because a 
single utterance cannot provide the high 
level of verification performance desired, 
the system employs a sequential decision 
using multiple phrases when necessary. In 
operation, the actual number of phrases 
used averages 1.6. 
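
The idea of a sequential decision can be sketched as follows. The source does not give the actual decision rule, so this is an assumed scheme: each phrase yields a match distance, the running mean is compared against an accept threshold and a reject threshold, and another phrase is requested only while the mean lies between them. All names and thresholds are hypothetical.

```python
def sequential_verify(phrase_distances, accept_below, reject_above, max_phrases=4):
    """Decide as soon as the running mean distance is conclusive.

    Returns (decision, number_of_phrases_used).
    """
    total = 0.0
    count = 0
    for count, d in enumerate(phrase_distances, start=1):
        total += d
        mean = total / count
        if mean <= accept_below:
            return "accept", count
        if mean >= reject_above or count == max_phrases:
            return "reject", count
    return "undecided", count  # ran out of phrases before reaching a decision
```

Most true speakers would be accepted on the first phrase under such a rule, which is consistent with an average well below two phrases per verification.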

Several companies offer speaker verifi¬ 
cation products, and voiceprints have 
emerged again. The claim of a voiceprint 
technique comes from Ecco Industries of 
Danvers, Mass. Ecco hopes to market a
$300 voice-recognition security device for
consumers this year. The intention is to 
create a digital picture of an individual’s 
vocal tract so that bad colds or mimics will 
not trip up the device. US Sprint is testing 
a phone card that can be used to recognize 
the caller’s voice. Card holders initiate 
calls in the usual way by dialing an 800 
number, then dialing a personal identifica¬ 
tion number. A voice prompt asks callers 
to speak their password. The system is 
intended to reduce credit card fraud, in¬ 
cluding elimination of card-swapping. 
Bellcore has developed a similar credit 
card, which stores the template of a spoken 
word as 6,400 bits. According to Bellcore, 
the system recognizes the speaker 99 per¬ 
cent of the time, but the error rate can 
double to 2 percent if the speaker has a 
cold. 


Speech recognition technology has
migrated from mini- and mainframe 
computers to workstations and per¬ 
sonal computers, and applications are al¬ 
ready running on them. As embedded digi¬ 
tal signal processors become more preva¬ 
lent on workstations (such as the Next 
computer), we expect to see much wider 
use of speech and speaker recognition. 

Current applications depend on the use 
of various simplifying constraints that 
make speech recognition feasible, as dis¬ 
cussed above. The dependence on them 
means that, although useful practical ap¬ 
plications of speech recognition exist, we 
have not yet achieved comfortable and 
natural communication with computers 
through voice. Text-dependent speaker 
recognition exists in the form of opera¬ 
tional systems, but accurate text-independ¬ 
ent speaker recognition remains a target. 

Improvements in the speech and speaker 
recognition techniques discussed here will 
no doubt advance the performance of rec¬ 
ognition systems, but it seems likely that 
we will also need natural language under¬ 
standing before we can achieve comfort¬ 
able and natural communication with 
computers through voice. ■ 


Acknowledgments 

We are grateful for suggestions from the 
anonymous referees, which have improved the 
article, and to Jerome Chiabaut and Taro Shiba- 
hara for help with an earlier version. 












References 


1. K. Lee, Automatic Speech Recognition: the 
development of the Sphinx System, Kluwer 
Academic Publishers, Norwell, Mass., 
1989. 

2. L. R. Bahl et al., “Speech Recognition of a 
Natural Text Read as Isolated Words,” 
Proc. IEEE Int’l Conf. Acoustics, Speech, 
and Signal Processing, April 1981, pp. 
1,168-1,171. 

3. D.O. Kimbal et al., “Recognition Perform¬ 
ance and Grammatical Constraints,” Proc. 
DARPA Speech Recognition Workshop, 
Feb. 1986, pp. 53-59. 

4. D. O’Shaughnessy, Speech Communica¬ 
tion: Human and Machine, Addison- 
Wesley, Reading, Mass., 1987. 

5. M.A. Franzini, M.J. Witbrock, and K.-F. 
Lee, “Speaker-Independent Recognition of 
Connected Utterances Using Recurrent and 
Nonrecurrent Neural Networks,” Proc. 
Int’l Joint Conf. Neural Networks, Vol.2, 
Washington, DC, June 1989, pp.II-1 to II- 
6. 

6. J. Mariani, “Recent Advances in Speech 
Processing,” Proc. IEEE Int’l Conf. Acous¬ 
tics, Speech, and Signal Processing, 
Glasgow, Scotland, May 1989, pp. 429- 
440. 

7. M.-W. Fung et al., “Improved Speaker 
Adaptation Using Text-Dependent Spec¬ 
tral Mappings,” Proc. IEEE Int’l Conf. 
Acoustics, Speech, and Signal Processing, 
New York City, 1988, pp. 131-134. 

8. D.B. Paul, “The Lincoln Robust Continu¬ 
ous Speech Recognizer,” Proc. IEEE Int’l 
Conf. Acoustics, Speech, and Signal Proc¬ 
essing, Glasgow, Scotland, 1989, pp. 449- 
452. 

9. H. Murveit and M. Weintraub, “1,000- 
Word Speaker-Independent Continuous- 
Speech Recognition Using Hidden Markov 
Models,” Proc. IEEE Int’l Conf. Acoustics, 
Speech, and Signal Processing, New York 
City, 1988, pp. 115-118. 

10. W. Wylegala, “A 20,000-Word Recognizer 
Based on Statistical Evaluation Methods,” 
Speech Technology Magazine, Apr./May 
1989, pp. 16-18. 

11. K. Yoshida, T. Watanabe, and S. Koga, 
“Large Vocabulary Word Recognition 
Based on a Demi-Syllable Hidden Markov 
Model Using a Small Amount of Training 
Data,” Proc. IEEE Int’l Conf. Acoustics, 
Speech, and Signal Processing, Glasgow, 
Scotland, 1989, pp. 1-4. 

12. L.R. Rabiner, J.G. Wilpon, and F.K. Soong, 
“High-Performance Connected-Digit Rec¬ 
ognition, Using Hidden Markov Models,” 
Proc. IEEE Int’l Conf. Acoustics, Speech, 
and Signal Processing, New York City, 
1988, pp. 119-122. 


13. G. R. Doddington, “Speaker Recognition — 
Identifying People by their Voices,” Proc. 
IEEE, Vol. 73, No. 11, Nov. 1985, pp. 1,651-1,664.

14. J.D. Markel and B. Davis, “Text-Independ¬ 
ent Speaker Recognition from a Large Lin¬ 
guistically Unconstrained Time-Spaced 
Database,” IEEE Trans. Acoustics, Speech, 
and Signal Processing, Vol. ASSP-27, No. 
1, 1979, pp. 74-82. 



15. M. Krasner et al., “Investigation of Text-Independent
Speaker Identification Techniques Under Conditions of Variable
Data,” Proc. IEEE Int’l Conf. Acoustics, Speech, and Signal
Processing, 1984, Paper 18B.5.

16. M.R. Birnbaum, L.A. Cohen, and F.X. Welsh, “A Voice Password
System for Access Security,” AT&T Tech. J., Vol. 65, No. 5,
Sept./Oct. 1986, pp. 68-74.



Richard D. Peacocke is a member of the Computing Research
Laboratory at Bell-Northern Research in Ottawa. At BNR he has
worked on office communications and software engineering and has
led groups working in software quality assurance and artificial
intelligence. He has published a number of technical papers,
including several on expert systems and telecommunications. His
current research interests include AI, knowledge-based software,
and speech technology.

Peacocke is president of the Canadian Society for Computational
Studies of Intelligence and a member of the IEEE Computer
Society, AAAI, and ACM. He is also an adjunct professor in the
Department of Systems and Computer Engineering at Carleton
University, Ottawa.

Peacocke has a PhD in computer science from the University of
Toronto, Canada. He received a BA in mathematics from Cambridge
University and an MS in computing science from the University of
Alberta, Canada.



Daryl H. Graf is a member of the Computing 
Research Laboratory at Bell-Northern Re¬ 
search in Ottawa, Canada. Since joining BNR 
in 1979, he has acquired extensive experience 
in the telecommunications field. As a former 
manager of BNR’s international operator serv¬ 
ices design group, he is interested in the appli¬ 
cation of speech processing technology to the 
operator services area. His current research 
interests include adaptive systems for pattern 
classification, and he has published technical 
papers on the use of neural networks for robot 
control. 

Graf received the BS with honors in com¬ 
puter science from the University of Calgary in 
1979 and the MCS with distinction from Carle- 
ton University, Ottawa, in 1988. He is a mem¬ 
ber of the IEEE, CSCSI, INNS, and AAAI. 


Readers may contact the authors at the Computing Research Laboratory, Bell-Northern Re¬ 
search, PO Box 3511 Station C, Ottawa, Ontario, Canada K1Y 4H7, Bitnet: richard@bnr.ca or 
dgraf@bnr.ca. 


Putting Speech Recognition to Work in the Telephone Network


Matthew Lennig 

Bell-Northern Research and INRS-Telecommunications 


The interactive voice technologies
include speech recognition, 
speaker verification, speech en¬ 
coding and decoding, and speech synthe¬ 
sis. These technologies provide a way for 
people to interact verbally with computers. 
Voice interaction is particularly useful 
over the telephone because it allows people 
to communicate directly with computers to 
perform simple tasks without the need for 
operators. 

Of all the interactive voice technologies, 
perhaps the most challenging is speech 
recognition, because of the inherent vari¬ 
ability in the way we speak. Speaker-inde¬ 
pendent speech recognition, in which the 
computer interacts with people who have 
not previously “trained” it to their speech 
characteristics, is more difficult than 
speaker-dependent recognition, in which 
the system has been trained to a particular 
user’s speech. When the speech to be rec¬ 
ognized is transmitted over the telephone 
network, further variability is imposed by 
the varying quality of network connec¬ 
tions. On top of these difficulties, the tele¬ 
phone network removes potentially useful 
information for word discrimination by 
cutting out spectral energy in the speech 
signal above about 3,300 Hz and below 
about 300 Hz. 



The early success of 
an automated call¬ 
handling system using 
interactive voice 
technologies 
foreshadows huge 
savings for telephone 
companies and a 
wealth of new services 
for consumers. 


Speaker-independent speech recognition was the biggest technical
hurdle in the development of Northern Telecom’s automated
alternate billing service (AABS) for collect calls,
third-number-billed calls, and calling-card-billed calls. The
AABS system automates a collect call by recording the calling
party’s name, placing a call to the called party, playing back
the calling party’s name to the called party, informing the
called party that he or she has a collect call from that person,
and asking, “Will you pay for the call?” The called party
responds yes or no to the speech recognizer, and the call is
completed or not, accordingly.

0018-9162/90/0800-0035$01.00 © 1990 IEEE

Before embarking on the AABS project, 
Bell-Northern Research ran a concept trial 
in the first half of 1988 with Bell Canada’s 
public customers to gain a better under¬ 
standing of the real-world behavior of 
speech recognition technology and to find 
out how acceptable it would be to the 
public. The application was a directory of 
“dial-it” services available in the 514 area 
code (Montreal), which callers accessed 
by dialing numbers beginning 9-7-6. Of¬ 
fered to the public free of charge, the direc¬ 
tory (in French) interacted with the caller, 
using stored-speech playback and speech 
recognition with a total vocabulary of 25 
words to provide information on the ser¬ 
vices available, their telephone numbers, 
and their prices. 

During the trial we found that when input words were within the
speech recognizer’s vocabulary, the rate of substitution
(recognition of an incorrect word) was 1 percent and the rate of
rejection (the input word not accepted as valid) was 3 percent
[1]. When input words were from outside the recognizer’s
vocabulary, a 50-percent false-acceptance rate occurred; that
is, the recognizer rejected only half of these invalid words.
For the directory application, such a high false-acceptance rate
was tolerable, but other applications, such as AABS, cannot
function effectively without a much greater discrimination
ability.

Figure 1. Network configuration of the Voice Service Node for
automation of collect, third-number-billed, and calling card
calls. (Diagram labels: calling party, Voice Service Node,
called party.)

Both the directory and AABS rely on 
isolated-word recognition—recognition 
of a single word or phrase spoken in isola¬ 
tion. In the laboratory, we are currently 
pursuing more advanced techniques that 
can recognize continuous speech. For 
example, we have developed a prototype 
connected-word recognizer, which is also 
speaker independent and works over the 
telephone network. 

Certain applications require much larger 
vocabularies. The University of Quebec’s 
National Scientific Research Institute in 
Telecommunications (INRS-Telecommu- 
nications) has developed a speaker-adap¬ 
tive isolated-word recognition system 
capable of recognizing a vocabulary of 
86,000 English words with 93-percent accuracy [2].
Because this system is based on
phonemes (the smallest distinguishing
sound units of speech), a new user need not 
train it on the entire vocabulary. The sys¬ 
tem can generalize from a short training 
script of 100 to 200 sentences (1,000 to 
2,000 words) to model the new user’s 
pronunciation of all 86,000 words. 

One application of the INRS-Telecom- 
munications recognition system is the 
Talkwriter, a voice dictation typewriter 
that enables a user to enter text into a 
computer by speaking instead of typing. 
To improve the usability of the Talk- 
writer, INRS-Telecommunications is en¬ 
hancing its recognition algorithm to allow 
continuous-speech input, eliminating the 
need for quarter-second pauses between 
dictated words. Earlier work using a 5,000- 
word vocabulary and continuous speech 
was done by Bahl, Jelinek, and Mercer [3].
Lee, Hon, and Reddy use context-dependent
allophone units to perform speaker-independent,
continuous-speech recognition
of a 1,000-word vocabulary [4].

Automated alternate billing service

AABS is of interest to telephone companies because it offers
large potential savings in operator time. In the United States
in 1988 there were an estimated 579 million intraLATA* collect
calls and 89 million intraLATA third-number-billed calls.
An operator’s average work time (AWT) 
for a collect call is 34 seconds. If no veri¬ 
fication of billing acceptance is performed, 
the AWT for a third-number-billed call is 
25 seconds; with verification the AWT is 
43 seconds. The estimated cost per opera¬ 
tor work-second is $0.0103. If we consider 
only intraLATA collect calls, assuming 
85-percent automation and 1988 call vol¬ 
ume, the potential annual savings in AWT 
nationwide is more than $172,000,000. 
Each of the seven regional holding compa¬ 
nies would save about $24,600,000 per 
year [5]. These figures do not include the
additional savings that would be realized 
from the automation of third-number¬ 
billed calls. 
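
The savings estimate follows directly from the figures quoted above. The short calculation below reproduces it; the variable and function names are chosen for this example only.

```python
COLLECT_CALLS_1988 = 579_000_000   # intraLATA collect calls, United States, 1988
AWT_SECONDS = 34                   # operator average work time per collect call
COST_PER_WORK_SECOND = 0.0103      # US dollars per operator work-second
AUTOMATION_RATE = 0.85             # assumed fraction of calls automated
REGIONAL_COMPANIES = 7             # regional holding companies


def annual_savings(calls, awt_seconds, cost_per_second, automation_rate):
    """Operator cost avoided per year, in US dollars."""
    return calls * awt_seconds * cost_per_second * automation_rate


nationwide = annual_savings(
    COLLECT_CALLS_1988, AWT_SECONDS, COST_PER_WORK_SECOND, AUTOMATION_RATE
)
per_company = nationwide / REGIONAL_COMPANIES

print(f"Nationwide:  ${nationwide:,.0f}")
print(f"Per company: ${per_company:,.0f}")
```

The result is just over $172 million nationwide and about $24.6 million per regional holding company, matching the figures in the text.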

The interactive voice technologies used 
in AABS are implemented on a special- 
purpose Northern Telecom system called 
the Voice Service Node (VSN), installed 
in the central telephone office. This equip¬ 
ment is connected to the toll switch via 
data links (X.25) and digital voice links 
(T-l), as illustrated in Figure 1. When a 0+ 
call (a call dialed with a preceding zero) 
arrives at the switch, it is sent to the VSN. 
The VSN plays a “bong” tone followed by 
spoken instructions (in digitally encoded 
speech) explaining the billing options. The 
caller indicates a billing method by using 
DTMF (dual-tone multifrequency, or 
touch-tone) signaling, that is, by dialing 
1-1 for collect, another telephone number 
for third-number billing, or a calling card 
number. 

A caller choosing collect or third-num¬ 
ber billing is asked to say his or her name, 
which the VSN captures and stores digi¬ 
tally. The VSN then asks the caller to wait 
while acceptance of charges is obtained 
from the billed party. Next, the VSN sends 
a message to the switch over the X.25 
control link, requesting that the switch 
outpulse the call to the billed party (the 
same as the called party in the case of 
collect calls). 

*LATA (local access and transport area): a geographical area
roughly the size of an area code calling area over which local
telephone operating companies carry

Figure 2. The speech recognizer.

For example, suppose Daniele Archer wishes to call Joe’s
Department Store, collect. She dials zero followed by the
telephone number of Joe’s Department Store. She hears a bong
tone followed by the prompt, “For collect calls, dial 1-1. To
charge this call to another number, dial the complete billing
number now.” (The billing number can be either another phone
number or Daniele’s calling card number.) She wants to make a
collect call, so she dials 1-1. The system says, “Please say
your name.” She says, “Daniele Archer,” and the system records
it digitally. The switch then originates a call to Joe’s
Department Store. When Joe answers, Daniele hears the following
dialogue between the VSN and Joe (words in boldface are a
digital playback of Daniele’s voice):

Joe: Hello, Joe’s Department Store! 

VSN: This is Michigan Bell. You have a 
collect call from Daniele Archer. Will 
you pay for the call? 

Joe: What did you say? 

VSN: This is Michigan Bell. You have a 
collect call from Daniele Archer. Please 
answer the following question yes or no. 
Will you pay for the call? 

Joe: Yeah. 

VSN: Thank you. Please go ahead. 

Note that when Joe responds with a 
phrase outside the speech recognizer’s 
vocabulary (“What did you say?”), the 
recognizer correctly rejects it. Dialogue 
error paths are designed to be as graceful 
and helpful as possible. Various types of 
speech recognition errors—rejection, no 
speech, speech too long, speech too 
short—generally evoke different dialogue 
branches designed to guide the user to a 
successful outcome. We have found that 
good voice dialogue design is extremely 
important for the success of an interactive 
voice application. Dialogue design is an art 
that benefits from extensive experience 
combined with a perfectionist mentality. 
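
One way to picture dialogue branches of this kind is as a lookup from recognizer outcome to the next action. The sketch below is illustrative only: the outcome names, the retry limit, the operator fallback, and all prompt wording except the yes-or-no reprompt quoted in the sample dialogue are assumptions, not details of AABS.

```python
# Reprompt text per error type. Only the "rejection" wording comes from the
# sample dialogue above; the other prompts are hypothetical.
REPROMPTS = {
    "rejection": "Please answer the following question yes or no.",
    "no_speech": "I did not hear an answer.",
    "too_long": "Please answer with a single word.",
    "too_short": "Please repeat your answer.",
}
QUESTION = "Will you pay for the call?"


def next_action(outcome, failed_attempts, max_attempts=3):
    """Map a recognizer outcome to (action, prompt_text_or_None)."""
    if outcome == "yes":
        return ("complete_call", None)
    if outcome == "no":
        return ("release_call", None)
    if outcome not in REPROMPTS:
        raise ValueError(f"unknown recognizer outcome: {outcome!r}")
    if failed_attempts + 1 >= max_attempts:
        # Assumed fallback: hand the call to a human operator.
        return ("transfer_to_operator", None)
    return ("reprompt", f"{REPROMPTS[outcome]} {QUESTION}")
```

Keeping a distinct prompt for each error type is what lets the dialogue guide the user differently after silence than after an out-of-vocabulary reply.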

Architecture of the Voice Interface

AABS uses the following interactive 
voice technologies: network-based, 
speaker-independent speech recognition; 
real-time digital recording (encoding) of 
speech; real-time digital playback (decod¬ 
ing) of speech; and detection and reception 
of DTMF signals. All these functions are 
performed by Northern Telecom’s Voice 
Interface component of the VSN. Each
Voice Interface unit can handle six simultaneous
voice channels. Eight such units
(with a ninth as a spare) provide the VSN 
with 48 voice channels. The Voice Inter¬ 
face was first used in the Bell Canada 976 
Directory trial. 

Each Voice Interface contains four 
printed-circuit cards: three voice cards and 
one general-purpose processor card, which 
acts as an interface to the AABS applica¬ 
tion software. Each voice card contains an 
application-specific integrated circuit, 
used to perform the dynamic time warping 
required for speech recognition (described 
in the following section). 

Speech recognition algorithm

Our speech recognition algorithm in¬ 
volves three main steps: feature extraction, 
word endpoint detection, and template 
matching, as illustrated in Figure 2. Fea¬ 
ture extraction divides an incoming word 
into 12.75-millisecond frames and calcu¬ 
lates the spectral parameters of each frame. 
Word endpoint detection locates the begin¬ 
ning and end of the word to be recognized 
(called the unknown). Template matching
then compares the spectral properties determined
by the feature extraction process
to the templates (or stored mathematical 
models) in the speech recognizer’s vo¬ 
cabulary to determine the closest match. 
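
Template matching with dynamic time warping, mentioned earlier as the operation performed on the voice cards, can be sketched in software as follows. This is a textbook form of the technique under simple assumptions (squared Euclidean frame distance, unconstrained warping path); it is not the implementation used in the Voice Interface, and the function names are made up for the example.

```python
def frame_distance(a, b):
    """Squared Euclidean distance between two feature frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dtw_distance(unknown, template):
    """Minimum cumulative frame distance over all monotonic time alignments."""
    inf = float("inf")
    rows, cols = len(unknown), len(template)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = frame_distance(unknown[i - 1], template[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # unknown advances, template frame repeats
                cost[i][j - 1],      # template advances, unknown frame repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[rows][cols]


def closest_template(unknown, templates):
    """Return the vocabulary word whose template aligns best with the unknown."""
    return min(templates, key=lambda word: dtw_distance(unknown, templates[word]))
```

Time warping lets a slowly spoken word match a template recorded at a different speaking rate, because repeated frames add no cost when they match.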

Feature extraction. Feature extraction 
captures the acoustic elements that distin¬ 
guish one word from another but ignores 
certain differences in the way the same 
word is spoken by different callers. The 
process ignores pitch and absolute loud¬ 
ness, which vary from caller to caller for 
each word, and concentrates instead on the 
overall spectral shape and how it changes 
across the word. 

After dividing a word into frames, fea¬ 
ture extraction measures the spectral prop¬ 
erties of each frame. The sensitivity of the 
analysis mimics the human ear by exhibit¬ 
ing greater frequency resolution at lower 
voice frequencies. Telephone transmis¬ 
sion limits the range of voice frequencies 
for analysis. It also requires extra process¬ 
ing to minimize the effect of impairments 
(for example, additive noise and frequency 
distortion) introduced by numerous net¬ 
work connections and telephone set types. 

The acoustic features used in the current 
implementation of the speech recognizer 



































Table 1. Rules for scoring recognition accuracy.

                          Recognizer Output
Recognizer Input     Yes     No      Rejection
Yes                  CA      FA      FR
No                   FA      CA      FR
Yes-equivalent       CA      FA      CR
No-equivalent        FA      CA      CR
Imposter             FA      FA      CR

CA = correct acceptance; FA = false acceptance; FR = false rejection; CR =
correct rejection


consist of mel-frequency cepstral coeffi¬ 
cients 6 together with their first time differ¬ 
ences. 7 The term mel-frequency means that 
the center frequencies of the channels, 
instead of being linearly distributed in 
frequency, are linearly spaced below 1,000 
Hz and logarithmically spaced above. This 
is intended as a rough approximation of the 
frequency discrimination properties of the 
ear. The mel-frequency cepstral coeffi¬ 
cients are calculated as follows: Every 
12.75 ms, a Hamming window of duration 
25.6 ms is applied to the input speech 
signal and a fast Fourier transform is used 
to compute a power spectrum. Power spec¬ 
trum points are combined into 20 mel- 
frequency channels by means of triangular 
weighting functions. Next, a log transfor¬ 
mation is applied to each of the 20 channel 
energies. Finally, a cosine transform is 
applied, yielding the mel-frequency cep¬ 
stral coefficients 

$$C_k = \sum_{n=1}^{20} L_n \cos\left[\frac{\pi k}{20}\left(n - \frac{1}{2}\right)\right]$$

where the $L_n$ ($n = 1, \ldots, 20$) are the 20 log channel
energies, the $C_k$ are the cepstral coefficients,
and $k$ takes on integer values between 0 and 7.
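
The final step of this calculation can be sketched in a few lines of Python. The sketch assumes the standard mel-cepstrum form of the cosine transform, C_k = sum over n of L_n cos[pi k (n - 1/2) / 20], and takes the 20 log channel energies as given; the windowing, FFT, and triangular weighting are omitted. The function names are illustrative, and the adjacent-frame difference used for the "first time differences" is an assumption, since the article does not spell out its exact definition.

```python
import math


def cepstral_coefficients(log_energies, num_coeffs=8):
    """Cosine transform of the log channel energies.

    C_k = sum_n L_n * cos(pi * k * (n - 1/2) / N), with n = 1..N and k = 0..num_coeffs-1.
    """
    n_channels = len(log_energies)
    coeffs = []
    for k in range(num_coeffs):  # k = 0..7 in the article
        total = 0.0
        for n, log_e in enumerate(log_energies, start=1):  # n = 1..N
            total += log_e * math.cos(math.pi * k * (n - 0.5) / n_channels)
        coeffs.append(total)
    return coeffs


def first_time_differences(frames):
    """Difference between each coefficient vector and the one before it.

    Assumption: a plain adjacent-frame difference stands in for the
    article's 'first time differences'.
    """
    return [[b - a for a, b in zip(prev, cur)]
            for prev, cur in zip(frames, frames[1:])]
```

A flat spectrum (all 20 log energies equal) puts all of its energy into C_0 and leaves the higher coefficients at zero, which is one way to see that the coefficients above C_0 describe spectral shape rather than overall level.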

Endpoint detection. To detect the be¬ 
ginning and the end of the unknown word 
or phrase, endpoint detection distinguishes 
speech from background noise. It differen¬ 
tiates the two by using a complex set of 
thresholds and rules to analyze changes in 
loudness. The thresholds automatically 
adjust to the diverse signal levels and noise 
impairments encountered on telephone 
networks. 

Noise poses a significant challenge in 
endpoint detection because many sounds 
(such as those produced by the letter f and
other fricative phonemes) embedded in 


words resemble telephone noise. Another 
difficulty is distinguishing the silences 
contained within words (for example, dur¬ 
ing the closure of the p in spin) from the 
silences that occur at the completion of a 
word or phrase. The endpoint detector’s 
thresholds and rules not only help differen¬ 
tiate speech from noise, they also distin¬ 
guish intraword silence from phrase-final 
silence. The technique used is similar in 
spirit to that described by Lamel et al. 8 
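
The article does not list the detector's actual thresholds and rules, so the following is only a much-simplified sketch of the idea. It assumes per-frame loudness values (for example, in dB), estimates the noise floor from the first few frames so that the threshold adapts to the line, and treats a short silence as part of the word while a long silence ends the utterance. All names and parameter values are hypothetical.

```python
def find_endpoints(frame_energies, noise_frames=5, margin=10.0, max_gap=8):
    """Return (start, end) frame indices of the utterance, or None if no speech.

    noise_frames, margin and max_gap are illustrative values, not the
    settings used in the system described in the article.
    """
    if len(frame_energies) <= noise_frames:
        return None
    # Adapt to the connection: estimate the noise floor from the leading frames.
    noise_floor = sum(frame_energies[:noise_frames]) / noise_frames
    threshold = noise_floor + margin

    start = end = None
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        if energy > threshold:
            if start is None:
                start = i
            end = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            # A short silence (such as a stop closure) stays inside the word;
            # a long one is taken as the end of the word or phrase.
            if silent_run > max_gap:
                break
    if start is None:
        return None
    return start, end
```

The max_gap parameter is what separates intraword silence from phrase-final silence in this sketch: raising it tolerates longer closures at the cost of a slower decision that the caller has finished speaking.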

Template matching. The template¬ 
matching process is at the heart of the 
speech recognition technique. It measures 
the similarity between the unknown and 
each of the templates in the active vocabu¬ 
lary. 

To develop templates for the speech 
recognizer’s vocabulary, we have col¬ 
lected sample words, called tokens, from 
tens of thousands of English-speaking men 
and women across the United States and 
Canada, in a variety of dialects, over a 
variety of telephone handsets, and over 
numerous long-distance and local connec¬ 
tions. Then we divided the tokens of each 
word into clusters, each cluster represent¬ 
ing similar pronunciations, and one tem¬ 
plate was generated from each cluster. 
Together, the templates represent a wide 
range of pronunciations for a particular 
word. The speech recognizer compares 
each unknown against the templates to find 
the closest match. 

During template matching, the template 
and the unknown must be properly aligned 
in time before a similarity score can be 
obtained. The process of time alignment is 
called dynamic time warping. 9 Dynamic 
time warping addresses the fact that differ¬ 
ent speakers not only pronounce a word at 
different speeds but also elongate different 
parts of the word. Dynamic refers to a 
technique known as dynamic program¬ 


ming, which determines the optimal piece- 
wise shrinking and stretching of an un¬ 
known to match it to a template. Time 
warping is the treatment of the time axes of 
the unknown and the template as if they 
were elastic bands—stretching and shrink¬ 
ing different parts of the unknown and the 
template to maximize the overall similar¬ 
ity score. 
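
A minimal sketch of dynamic time warping follows. It is a textbook formulation, not the algorithm implemented on the chip: it minimizes an accumulated Euclidean distance rather than maximizing a similarity score, and it allows the three simplest local steps. The function name and the choice of distance are assumptions made for illustration.

```python
def dtw_align(unknown, template):
    """Align two sequences of feature vectors.

    Returns (total_distance, path), where path is a list of
    (unknown_index, template_index) pairs from start to end.
    """
    def frame_distance(u, t):
        return sum((a - b) ** 2 for a, b in zip(u, t)) ** 0.5

    n, m = len(unknown), len(template)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(unknown[i - 1], template[j - 1])
            # Best of: advance both, stretch the template, stretch the unknown.
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Trace the optimal path back from the end point to the start point.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path
```

In the returned path, a repeated template index means that part of the template was stretched to cover several frames of the unknown, and a repeated unknown index means the opposite. This is the "elastic band" behavior described above.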

Template matching requires extensive 
computation. Each unknown is compared 
with the total active vocabulary of the 
speech recognizer, with from five to 200 
templates representing the various pronun¬ 
ciations of each vocabulary word. The top 
10 choices go through a second matching 
process, which uses a different feature 
extraction strategy—one that is more ro¬ 
bust against telephone network impair¬ 
ments but is less precise in discriminating 
between words. The results of the two 
strategies are compared, and if they dis¬ 
agree, the result with the larger similarity 
score ratio is chosen. 10 The similarity score 
ratio is $s_1/s_2$, where $s_1$ is the similarity score
of the top-choice word and $s_2$ is the similarity
score of the second-highest-scoring
word. 
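
The decision between the two matching strategies can be sketched as follows. The sketch assumes each strategy returns a list of (word, similarity score) pairs sorted best first, with positive scores where higher means more similar; the function names and the tie-breaking in favor of the first pass are illustrative assumptions.

```python
def score_ratio(ranked):
    """ranked: list of (word, similarity) pairs, best first. Returns s1 / s2."""
    (_, s1), (_, s2) = ranked[0], ranked[1]
    return s1 / s2


def choose_word(first_pass, second_pass):
    """Pick the top word, using the similarity score ratio to settle disagreements."""
    top_first, top_second = first_pass[0][0], second_pass[0][0]
    if top_first == top_second:
        return top_first
    # The strategy whose winner stands further ahead of its runner-up is trusted.
    if score_ratio(first_pass) >= score_ratio(second_pass):
        return top_first
    return top_second
```

A ratio close to 1 means the best and second-best words scored almost the same, so the strategy that produced it has little basis for preferring its top choice.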

An application-specific chip, mentioned 
earlier, cost-effectively and reliably im¬ 
plements the template-matching algo¬ 
rithm. Developed by means of the silicon 
user design system and CMOS (comple¬ 
mentary metal-oxide semiconductor) fab¬ 
rication technology, the chip performs 
over 1,000 matches a second. 

The outcome of the speech recognition 
algorithm is an ordered list of matching 
vocabulary words, with associated simi¬ 
larity scores, for each unknown. The simi¬ 
larity scores indicate the speech 
recognizer’s confidence in a chosen tem¬ 
plate and help to identify unknowns that 
are not in its vocabulary. 

Accuracy of the 
recognizer 

Although the recognizer’s vocabulary 
for AABS consists only of the two words 
yes and no, under real-world conditions 
users do not always stay within the con¬ 
straints of that vocabulary. Potential re¬ 
sponses to a yes-or-no question such as 
Will you pay for the call? include Yes, 
ma’am; Yes, I will; Yeah; Yup; What?;
Who’s this?; Hold on a minute; No, ma’am;
Mommy!; No way; No, thank you; and
many others. 

The recognizer can do one of three 
things with the input utterance: (1) accept 


















Figure 3. Trade-off between false acceptance and false rejection for successive refinements of the speech recognizer, based
on a 5,021-token test set. (Plot not reproduced; horizontal axis: false acceptance rate, %.)


it as yes, (2) accept it as no, or (3) reject it 
as outside the valid vocabulary or as too 
close to call between the valid choices. 
When a rejection occurs, the caller is 
reminded of what the valid input vocabu¬ 
lary is and given a second chance to speak 
a word from it, as in the dialogue example 
given earlier. 

Four possible outcomes are used to score 
the recognizer. If the recognizer accepts an 
input utterance (for example, recognizes it 
as no), the outcome is labeled a correct 
acceptance (CA) if the classification is 
correct (that is, the input was no) or a false 
acceptance (FA) if the classification is 
incorrect (the actual input was yes or who 
is this?). If the recognizer rejects the input 
utterance, the outcome is labeled a correct 
rejection (CR) if the input was an imposter 
(a word from outside the vocabulary, such 
as what) or as a false rejection (FR) if the 


input was a valid yes or no. 

Certain forms of yes (such as yeah, yeh, 
yuh, yup, ye, yop) and no (nope, naw, nah, 
neh) are considered sufficiently close to 
the word in question to be mandatory ac¬ 
ceptances. Such forms are treated the same 
as yes and no. In other words, they are 
counted as false rejections if they are re¬ 
jected. 

This leaves the problem of what to do 
with expressions such as yes, ma’am, 
which carry the meaning of yes and even 
have the word embedded in them. We 
created a special class of such phrases, 
which we refer to as yes-equivalents. 
Similarly, no-equivalents are phrases that 
mean no and have a phonetic sequence 
resembling no embedded in them—for 
example, no, I will not. 

For scoring purposes, we have chosen to 
consider yes- and no-equivalents as op¬ 


tionally rejectable. That is, they are 
counted as correct rejections if rejected 
and as correct acceptances if correctly 
accepted (for example, yes, I will recog¬ 
nized as yes). Table 1 summarizes the 
scoring rules just described. For example, 
whenever the recognizer input is yes and 
its output is no, the outcome is counted as 
an FA. Recognition error rates are stated in 
terms of percent FA and percent FR. 
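
The scoring rules of Table 1 reduce to two checks, which the following sketch makes explicit. The category strings and the function name are illustrative choices; the rules themselves are those given in the table.

```python
# What an acceptance has to say for each kind of input to count as correct.
ACCEPT_MEANING = {
    "yes": "yes",
    "no": "no",
    "yes-equivalent": "yes",
    "no-equivalent": "no",
    # "imposter" has no entry: any acceptance of an imposter is false.
}


def score_outcome(spoken, output):
    """Score one token.

    spoken: 'yes', 'no', 'yes-equivalent', 'no-equivalent', or 'imposter'.
    output: 'yes', 'no', or 'reject'.
    Returns 'CA', 'FA', 'FR', or 'CR'.
    """
    if output == "reject":
        # Only plain yes and no (including mandatory forms such as yeah or nope)
        # must be accepted; everything else may be rejected without penalty.
        return "FR" if spoken in ("yes", "no") else "CR"
    return "CA" if ACCEPT_MEANING.get(spoken) == output else "FA"
```

For example, score_outcome("yes-equivalent", "reject") returns "CR", which is what makes the equivalents optionally rejectable.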

An obvious trade-off exists between FA 
and FR. The trade-off is controlled through 
a threshold parameter. Figure 3 shows a 
family of trade-off curves during the pe¬ 
riod in which we were fine-tuning the fea¬ 
ture extraction, training, and recognition 
algorithms (November 1988 through 
March 1989). The test set for each curve 
consists of 5,021 tokens of yes, no, and 
imposters sampled all over the United 
States and Canada (except Quebec) over 












Figure 4. Composition of the 5,021-token test set.


Table 2. Confusion matrix for the March 1989 recognition experiment.

                            Recognizer Output
Recognizer Input     Yes       No        Rejection   Totals
Yes                  1,336     20        83          1,439
No                   7         1,522     97          1,626
Yes-equivalent       173       3         479         655
No-equivalent        11        181       514         706
Imposter             5         3         587         595
Totals               1,532     1,729     1,760       5,021


long-distance, dialed-up connections. The 
composition of the test set, shown in Fig¬ 
ure 4, is 61-percent yes and no, 27-percent
yes-equivalent and no-equivalent, and 12- 
percent imposters. The curves in Figure 3 
show that it is possible to achieve an oper¬ 
ating point below 1-percent FA and 5- 
percent FR on national data such as these. 

Table 2 is a detailed confusion matrix 
for the March 1989 experiment using an 
operating point of approximately 1-per¬ 
cent FA and 5-percent FR. The different 
rows of the table correspond to what the 
caller actually said (yes, no, yes-equiva¬ 
lent, no-equivalent, or imposter) and con¬ 
stitute the input to the recognizer. The 
columns correspond to the three possible 


recognizer actions: accept as yes, accept as 
no, and reject. Each cell in the table shows 
the number of tokens out of 5,021 in which 
the specified input produced the specified 
recognizer output. For example, the first 
row of the table shows that out of 1,439 yes 
tokens, the recognizer classified 1,336 as 
yes and 20 as no and rejected the remaining 
83. The fifth row of the table shows that of 
the 595 times that callers said words out¬ 
side the recognizer’s vocabulary (impos¬ 
ters), the recognizer mistakenly recog¬ 
nized a yes five times and a no three times 
and correctly rejected the input 587 times. 
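
As a check on the arithmetic, the FA and FR counts can be recovered from Table 2 by applying the scoring rules of Table 1 to each cell. The sketch below assumes that both rates are taken as a percentage of all 5,021 tokens; the article does not state the denominator, but this choice matches the way the Ameritech field figures are quoted later (24 of 2,608 tokens as 0.92 percent). The names are illustrative.

```python
TABLE_2 = {
    "yes":            {"yes": 1336, "no": 20,   "reject": 83},
    "no":             {"yes": 7,    "no": 1522, "reject": 97},
    "yes-equivalent": {"yes": 173,  "no": 3,    "reject": 479},
    "no-equivalent":  {"yes": 11,   "no": 181,  "reject": 514},
    "imposter":       {"yes": 5,    "no": 3,    "reject": 587},
}

# Which acceptance is correct for each kind of input (none for imposters).
MEANING = {"yes": "yes", "no": "no", "yes-equivalent": "yes", "no-equivalent": "no"}


def error_rates(matrix):
    """Return token total, FA and FR counts, and both rates as percentages."""
    total = sum(sum(row.values()) for row in matrix.values())
    false_accepts = sum(
        count
        for spoken, row in matrix.items()
        for output, count in row.items()
        if output != "reject" and MEANING.get(spoken) != output
    )
    # Only plain yes and no inputs count as false rejections when rejected.
    false_rejects = matrix["yes"]["reject"] + matrix["no"]["reject"]
    return {
        "total": total,
        "fa": false_accepts,
        "fr": false_rejects,
        "fa_percent": 100.0 * false_accepts / total,
        "fr_percent": 100.0 * false_rejects / total,
    }
```

Under that assumption the table yields 49 false acceptances and 180 false rejections, or about 0.98 percent FA and 3.6 percent FR, which agrees with the stated operating point of roughly 1-percent FA and under 5-percent FR.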

Results have been similar when real 
customers have used a speech recognition 
system to place collect and third-number¬ 


billed calls in the Grand Rapids, Michigan, 
LATA: The FA rate is less than 1 percent 
and the FR rate is less than 5 percent. An 
operating point at which the FA rate is 
lower than the FR rate was chosen because 
the perceived cost of a false acceptance is 
substantially higher than that of a false 
rejection. When a rejection occurs, the 
caller is given a second chance to respond; 
if a second rejection occurs, the caller is 
transferred to an operator. On the other 
hand, a false acceptance triggers an unde¬ 
sired action: billing for an unwanted call or 
termination of a desired call. 


On May 5, 1989, at 6:59 a.m. in
Grand Rapids, the first public- 
customer collect call was auto¬ 
mated by means of speech recognition. 
The call was from “Glenn” and worked 
perfectly. The system was put into full 
service on May 15, 1989. 

VSN systems have been deployed in 36 
sites in the Ameritech and NYNEX re¬ 
gions.* In an Ameritech study of 2,608 
tokens from public users, spoken in re¬ 
sponse to the first prompt for input, 24 
tokens (0.92 percent) were false accep¬ 
tances, while 47 tokens (1.8 percent) were 
false rejections. 11 The remaining 2,537
tokens were handled correctly. A bilingual 
version of the VSN is scheduled for intro¬ 
duction in Bell Canada this year. 

So far, customer satisfaction with 
AABS has been high. We feel that this is 
due to two factors: the attention to detail 
that went into the design of the voice dia¬ 
logue and the excellent rejection charac¬ 
teristics of the recognizer when confronted 
with imposters. We gained invaluable ex¬ 
perience through the 1988 Bell Canada 
976 Directory concept trial, which guided 
our thinking and forced us to focus on 
these two key issues. 

Potential future applications of speech 
recognition in the telephone network in¬ 
clude voice entry of telephone calling card 
numbers, catalog shopping order entry, 
voice control of voice and text message 
systems, automation of additional opera¬ 
tor services, control of subscriber calling 
features, voice dialing for mobile tele¬ 
phones, airline flight status information, 
frequent-flyer account status, real-time 
financial information retrieval, automated 


* Ameritech is the parent company of Michigan Bell,
Ohio Bell, Indiana Bell, Illinois Bell, and Wisconsin 
Bell. NYNEX is the parent company of New York 
Telephone and New England Telephone. 





















switchboards, bank by phone, student 
course registration, and talking yellow 
pages. ■ 


Acknowledgments 

The INRS-Telecommunications research on 
very large vocabulary speech recognition was 
supported by the Natural Sciences and Engi¬ 
neering Research Council of Canada. 

The author wishes to thank the following 
colleagues, who made significant technical 
contributions to the work described in this ar¬ 
ticle: Greg Bielby, Doug Sharp, C.C. Chu, 
Andrew McGregor, Paul Boucher, Vishwa 
Gupta, Richard Jankowski, Georges Mony, 
Pierre Gendron, Ed Dermardiros, David Sloan, 
Rafi Rabipour, and Paul Mermelstein. 


References 

1. M. Lennig and P. Mermelstein, “First Pub¬ 
lic Trial of a Speech-Recognition-Based 
976 Directory,” Proc. Speech Tech '88, 
Media Dimensions, New York, Apr. 26-28, 
1988, pp. 291-292. 

2. L. Deng, M. Lennig, and P. Mermelstein, 
“Modeling Microsegments of Stop Conso¬ 
nants in a Hidden Markov Model-Based 
Word Recognizer,” J. Acoustical Society of 
America, Vol. 87, No. 6, June 1990, pp. 
2,738-2,747. 

3. L.R. Bahl, F. Jelinek, and R.L. Mercer, “A 
Maximum Likelihood Approach to Con¬ 
tinuous Speech Recognition,” IEEE Trans. 
Pattern Analysis and Machine Intelligence, 
Vol. PAMI-5, Mar. 1983, pp. 179-190.

4. K.-F. Lee, H.-W. Hon, and R. Reddy, “An 
Overview of the Sphinx Speech Recogni¬ 
tion System,” IEEE Trans. Acoustics, 
Speech, and Signal Processing, Vol. ASSP- 
38, No. 1, Jan. 1990, pp. 35-45. 

5. M. Lennig and M.L. Hanford, “Automating 
Operator Services with Speech Recogni¬ 
tion,” in Voice Processing: Cashing In on 
the Telephone Network (Proc. Probe Re¬ 
search Voice Processing Conf.), Probe 
Research, Inc., Morristown, N.J., 1988. 

6. S.B. Davis and P. Mermelstein, “Compari¬ 
son of Parametric Representations for 
Monosyllabic Word Recognition in Con¬ 
tinuously Spoken Sentences,” IEEE Trans. 
Acoustics, Speech, and Signal Processing, 
Vol. ASSP-28, No. 4, 1980, pp. 357-365. 

7. M. Lennig, P. Mermelstein, and V.N. 
Gupta, “Speech Recognition,” Canadian 
Patent No. 1 232 686, issued Feb. 9, 1988, 
Ottawa, Canada. 

8. L.F. Lamel et al., “An Improved Endpoint 
Detector for Isolated Word Recognition,” 


IEEE Trans. Acoustics, Speech, and Signal 
Processing, Vol. ASSP-29, No. 4, Aug. 
1981, pp. 777-785. 

9. M.J. Hunt, M. Lennig, and P. Mermelstein, 
“Use of Dynamic Programming in a Syl¬ 
lable-Based Continuous Speech Recogni¬ 
tion System,” in Time Warps, String Edits, 
and Macromolecules: The Theory and 
Practice of Sequence Comparison, D. 
Sankoff and J. Kruskal, eds., Addison- 
Wesley, New York, 1983, pp. 163-187. 

10. V.N. Gupta, M. Lennig, and P. Mermel¬ 
stein, “Decision Rules for Speaker-Inde¬ 
pendent Isolated Word Recognition,” Proc. 
1984 IEEE Int’l Conf. Acoustics, Speech, 
and Signal Processing, IEEE, Piscataway, 
N.J., 1984, pp. 9.2.1-9.2.4. 

11. R.W. Bossemeyer, E.C. Schwab, and B.A. 
Larson, “Automated Alternate Billing 
Services at Ameritech,” J. American Voice 
I/O Society, Vol. 7, Mar. 1990, p. 50.



Matthew Lennig is manager of interactive 
services for Bell-Northern Research, with re¬ 
search and development responsibilities in the 
speech and image processing areas. Previously, 
as manager of interactive voice systems, he was 
responsible for development of the speech tech¬ 
nology component of Northern Telecom’s auto¬ 
mated alternate billing service, described in this 
article. Earlier he was manager of speech sys¬ 
tems, responsible for development of Bell 
Canada’s speech-recognition-based 976 Direc¬ 
tory and for algorithmic research in speech 
recognition, speaker verification, and speech 
synthesis. 

Since 1981 Lennig has been a visiting profes¬ 
sor at the University of Quebec’s INRS- 
Telecommunications, where, from 1986 to 
1989, he headed an NSERC-funded research 
project on very large (86,000-word) vocabulary 
speech recognition. 

Lennig graduated summa cum laude from 
Princeton University in 1974 in an independent 
concentration combining mathematics, linguis¬ 
tics, computer science, and electrical engineer¬ 
ing. He received a PhD in linguistics in 1978 
from the University of Pennsylvania, which he 
attended as a National Science Foundation fel¬ 
low, and an MEng in electrical engineering 
from McGill University in 1984. He is a mem¬ 
ber of the IEEE Computer Society. 


The author can be contacted at Bell-Northern 
Research and INRS-Telecommunications, 3 
Place du Commerce, Nuns’ Island, Montreal, 
Quebec, Canada H3E 1H6. 


Late Magazines? 
No Magazines? 
Membership 
Status Problems? 
No Answers 
To Your 
Complaints? 

Let your Computer Society Ombudsman
cut through the red tape for you.


IEEE Computer Society 
10662 Los Vaqueros Circle 
PO Box 3014 
Los Alamitos, CA 
90720-1264 



DIRECTOR OF 

COMPUTING 

SYSTEMS 

Carnegie Mellon University is looking 
for a technical leader for its outstand¬ 
ing campus computing. The Director 
of Computing Systems is responsible 
for network based services, central 

highly distributed environment. We 
are searching for somebody who can 
combine innovation with high quality 
production services, who can be the 
head of a major unit on-campus and an 
external representative for the 

The director must have a very strong 
background in advanced computing. 
Formal qualifications are less impor¬ 
tant than proven leadership and the 
ability to gain the respect of a highly 
technical community. 

Send your resume to or contact for 


Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213
(412) 268-2122




itive Action/EOE 














The joy of C-scape 


The C-scape™ Interface

Management System is a flexible 
library of C functions for data entry 
and validation, menus, text editing, 
context-sensitive help, and windowing. 
C-scape’s powerful Look & Feel™ 
Screen Designer lets you create full- 
featured screens and automatically 
generates complete C source code. 

C-scape includes easily modifiable high- 
level functions as well as primitives to 
construct new functions. Its object- 
oriented design helps you build more 
functional, more flexible, more portable, 
and more unique applications—and 
you’ll have more fun doing it. 

The industry standout. Many 
thousands of software developers world¬ 
wide have turned to the pleasure of 

C-scape. The press agrees: 
“C-scape is by far the best.
. . . A joy to use,” wrote
IEEE Computer. Major
companies have selected C-scape as a 
standard for software development. 
C-scape’s open architecture lets you use 
it with data base, graphics, or other C 
and C++ libraries. C-scape runs in text or 
graphics mode, so you can display text 
and graphics simultaneously. To port 
from DOS or OS/2 to UNIX, AIX, QNX, or 
VMS, just recompile. C-scape also 


Elegant graphics and text 

>r graphics mode. 

Object-oriented architecture. Add custom 
features and create reusable code modules. C • 
compatible. 

Mouse support. Fully-integrated mouse support for 
menu selections, data entry fields, and to move and 
resize windows. 

Portability. Hardware independent code. Supports 
DOS, OS/2, UNIX, AIX, VMS, others. Autodetects 
Hercules, CGA, EGA, VGA. Supports Phar Lap and 
Rational DOS extenders. 

Text editing. Text editors with word wrap, block 
commands, and search and replace. 

Field flexibility. Masked, protected, marked, 
required, no-echo, and named fields with complete 
data validation. Time, date, money, pop-up list, and 
many more higher-level functions; create your own. 
Windows. Pop-up, tiled, bordered and exploding 
windows; size and numbers limited only by RAM. 
Menus. Pop-up, pull-down, 123-style, or slug menus; 
create your own. 

Context-sensitive help. Link help messages to 
individual screens or fields. Cross reference messages 
to create hypertext-like help. 

Code generation. Build any type of screen or form 
with the Look & Feel™ Screen Designer, test it, then 
automatically convert it to C code. 

Screen flexibility. Call screens from files at run
time or link them in. Automatic vertical/horizontal 
scrolling. 

International support. Offices in Berlin, Germany, 
with an international network of technical companies 
providing local training, support and consulting. 


Trial with a smile. C-scape is
powerful, flexible, portable, and easy to 
try. Test C-scape for 30 days. It offers a 
thorough manual and function reference, 
sample programs with source code, and 
an optional screen designer and source 
code generator. Oakland 
provides access to a 24- 
hour BBS, telephone servi¬ 
ces, and an international 
network of companies providing in¬ 
country support. No royalties, runtime 
licenses, runtime modules. After you 
register, you get complete library source 
code at no extra cost. 

Call 800-233-3733 (617-491-7311 in
Massachusetts, 206-746-8767 in Washing¬ 
ton; see below for International). After 
the joy of C-scape, programming will 
never be the same. 

DOS, OS/2 (Borland and Microsoft 
support): with Look & Feel, $499; library 
only, $399; UNIX, etc. start at $999; 
prices include library source. Training 
in Cambridge and Seattle each month. 
Mastercard and Visa accepted. 




Oakland Group. Inc. 675 Massachusetts Ave., Cambridge, MA 02139 USA. FAX: 617-868-4440. Oakland Group, GmbH. Alt Moabit 91-B, D-1000 Berlin 21, F.R.G. 

(030) 391 5045, FAX: (030) 393 4398. Oakland International Technical Network (training, support, consulting): Australia Noble Systems (02) 564-1200; Benelux TM 
Data (02159) 46814; Denmark Ravenholm (042) 887249; Austria-Germany-Switzerland ESM 07127/5244; Norway Ravenholm (02 ) 448855; Sweden Linsoft (013) 111588; 
U.K. Systemstar (0992) 500919. Photo by Jessica A. Boyatt; Kanji by Kaji Aso. Picture shows a C-scape program combining data entry with video images loaded from PCX
files. C-scape and Look & Feel are trademarks of Oakland Group, Inc.; other trademarks belong to their respective companies. Copyright © 1990, by Oakland Group, Inc. 
Features, prices, and terms subject to change. Reader Service Number 4















Anser 

An Application of Speech Technology to 
the Japanese Banking Industry 


Ryohei Nakatsu 

Nippon Telegraph and Telephone 


If a customer and a computer could exchange
information verbally over a
telephone, the ordinary telephone set 
would become a computer terminal. To 
achieve that aim, researchers have devel¬ 
oped various prototype systems to recog¬ 
nize speech regardless of the speaker. So 
far, however, only a few systems have been 
applied practically. 

In Japan, Nippon Telegraph and Tele¬ 
phone has combined speaker-independent 
speech recognition and speech synthesis 
technologies in a telephone information 
system called Anser (Automatic Answer 
Network System for Electrical Requests). 
Since its introduction in 1981, the system 
has provided information services for the 
banking industry. Most Japanese banks 
now use Anser, serving customers at a rate 
of several hundred thousand calls per day. 
The system has also been introduced in the 
securities industry. This article discusses 
Anser’s design, its current application in 
banking services, and the reasons for its 
success. 

System overview 

Anser’s voice response and voice recog¬ 
nition capabilities let customers make 
inquiries and obtain information through a 
dialogue with a computer. When Anser 


Anser combines 
speech recognition 
and synthesis to offer 
telephone banking 
services to millions of 
customers. New 
technology will soon 
make the system 
cheaper and expand 
its uses. 


was first developed in 1981, the system 
had only voice response capability and 
could accept input only from touch-tone 
telephones. Speech recognition was added 
by the end of that year, permitting system 
access through ordinary dial telephones. 
Later, facsimile and modem access capa¬ 
bilities were added. 

Figure 1 shows a typical Anser system 
configuration for a banking application. 
Audio response control equipment makes 

0018-9162/90/0800-0043$01.00 © 1990 IEEE 


the connection with the bank computer and 
performs such tasks as receiving, editing, 
and dispatching messages. Audio response 
equipment controls the speech recognition 
unit, the facsimile response unit, and the 
personal-computer control unit. The audio 
response equipment also automatically 
initiates calls to various kinds of terminals 
and responds to calls from customers. The 
facsimile response unit receives data, 
transforms it to written form, and sends it 
to facsimile terminals. The personal com¬ 
puter control unit contacts personal com¬ 
puters and transmits data between the 
computers and Anser. 

The Anser system is in place in more 
than 70 cities across Japan, with all Anser 
centers interconnected by a data communi¬ 
cations network. Customers can access an 
Anser center and obtain banking services 
for a small fee wherever they live. Anser’s 
main benefits are: 

• Charges for obtaining information 
over the system are low — approximately 
the same as for a local telephone call. 

• Installation expenses don’t burden the 
individual bank, because several banks can 
share the expense for Anser equipment. 

• System uniformity lets customers ac¬ 
cess any bank computer with the same 
access procedures and receive banking 
information in the same format from all 
banks. 












Figure 1. Anser system configuration. (Diagram not reproduced.)


Figure 2. Voice recognition method. (Flowchart not reproduced. Box labels, in order: voice input; spectral analysis; distance calculation, with phoneme-like template set 1; dynamic programming matching, with word templates; word identification 1. These boxes are grouped as dynamic time warping. Then: discrimination function, with phoneme-like template set 2 and a weighting matrix; word identification 2. These boxes are grouped as word discrimination. Finally: output.)


Speech recognition 
and synthesis 

Speaker-independent speech recogni¬ 
tion (that is, the recognition of patterns 
regardless of the speaker) is particularly 
difficult through telephone lines because, 
in addition to variations among speakers, 
telephone sets and lines cause varying 
amounts of distortion. To simplify the 
manipulation of speech data, Anser incor¬ 
porates several original modifications of 
conventional speech recognition and syn¬ 
thesis technologies (explained below). 

The system’s 16-word lexicon consists 
of the 10 digits and six control words in 
Japanese. A huge amount of telephone 
speech with a wide range of telephone-set 
and line variations and speaker characteris¬ 
tics was collected into a speech database. 
The samples came from 1,564 male and female
speakers, ranging in age from 20 to 60 years,
in three regions of Japan.
From these, we used about 1,300 samples to 
generate 256 each of the phoneme-like and 
word templates. We used the remaining 
samples as test data. Our tests showed that 
the average recognition accuracy for the 
16-word vocabulary was 96.5 percent. 

The best known and most widely used 
method to analyze speech waves is based 
on linear predictive coding, in which 
speech features are expressed by a sequence 
of linear predictive parameter vectors. 

Recognition principle. In speech recog¬ 
nition, an input speech pattern is compared 
with each of several reference patterns. The 
reference pattern with the smallest mathe¬ 
matical distance (the least difference) from 
the input pattern is selected for recognition. 

To recognize speaker-independent tele¬ 
phone speech, recognition algorithms must 
cope with a wide range of variations. This 
entails collecting and analyzing a large 
amount of speech data from a wide variety 
of speakers, telephone sets, and telephone 
lines. Two types of algorithms have been 
proposed. One is based on a pattern-match¬ 
ing technique that uses multiple sequences 
of linear predictive coding parameters, 
called templates, to cover the variation 
range for each target word. The other 
method expresses the variation range as a 
statistical model, as with the popular hid¬ 
den Markov models. 1 

However, Anser adopts the conventional 
method for speaker-independent speech 
recognition, which is a combined pattern¬ 
matching method using dynamic time 
warping and multiple templates. 2 The 
major drawback of this method is that it 


















































































Figure 3. Conventional dynamic time warping. The lengths
of the two illustrated paths are different.

Figure 4. Staggered-array dynamic time warping. Open
circles indicate points where dynamic time-warping calculation
is not necessary.


uses many templates and thus requires 
many calculations. Anser has adopted 
several modifications that either cope with 
this calculation problem or improve the 
recognition performance. To reduce the 
number of calculations, the Split (strings 
of phoneme-like templates) method 3,4 is 
used to represent speech. The various 
speech sounds are precisely represented 
using about 100 parameter sets, called 
phoneme-like templates, which represent 
the variations that occur in each phoneme 
when it is uttered with other phonemes. 
Also, a variation of the dynamic time¬ 
warping method, called staggered-array 
dynamic programming, 5 reduces the 
amount of calculation, and a simple statis¬ 
tical method improves recognition. 

Recognition procedures. Figure 2 
shows the Split recognition process. Input 
speech is first converted into a sequence of 
linear predictive coding parameters. Then 
the parameters for each 15- to 20-millisec¬ 
ond speech frame are compared with the 
set of phoneme-like templates, and the pa¬ 
rameter string is converted into a distance 
matrix. Each word template in the lexicon 
is expressed as a sequence of phoneme-like
symbols. Both the Split method and
staggered-array dynamic programming ef¬ 
fectively reduce the amount of calculation 
during dynamic time warping. 

Figure 3 illustrates conventional dy¬ 
namic time warping. For comparison, let’s 


represent the time sequences of two feature 
vectors as 

$$A = a_1, a_2, \ldots, a_I$$

$$B = b_1, b_2, \ldots, b_J$$

The distance between A and B is defined as

$$D(A, B) = \min_F \sum_{k=1}^{K} d(c_k)$$

where F is a time-warping function
between A and B and consists of lattice
points on the A-B plane:

$$F = c_1, c_2, \ldots, c_K$$

where $c_1 = (1, 1)$, $c_K = (I, J)$, $c_k = (i_k, j_k)$, and
$d(c_k) = d(a_{i_k}, b_{j_k})$, that is, the distance
between $a_{i_k}$ and $b_{j_k}$. Then D(A, B) can be
effectively calculated using dynamic programming.
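
One common way to carry out this minimization is a cumulative-distance recurrence over the lattice. It is given here as a standard illustration under the simplest local path constraints, not as the exact constraints used in Anser; the symbol g is introduced only for this sketch.

$$g(1, 1) = d(a_1, b_1)$$

$$g(i, j) = d(a_i, b_j) + \min\{\, g(i-1, j),\; g(i-1, j-1),\; g(i, j-1) \,\}$$

$$D(A, B) = g(I, J)$$

Each lattice point is visited once, so the conventional method needs on the order of I times J local distance calculations for every template. This is the cost that the modifications described next are meant to reduce.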

When $d(a_{i_k}, b_{j_k})$ is accumulated to obtain
D(A, B) in the matching process of the
original dynamic time-warping method,
we must obtain $d(a_{i_k}, b_{j_k})$ by calculating
the distance value between the two vectors
$a_{i_k}$ and $b_{j_k}$. In the Split method, we obtain
$d(a_{i_k}, b_{j_k})$ simply by accessing the distance
matrix for $d(a_{i_k}, p_l)$, where $p_l$ is one of the
phoneme-like templates corresponding to
$b_{j_k}$. This process greatly reduces the
amount of distance calculation.

Conventional dynamic time warping also requires a distance calculation
for each point that corresponds to a combination of the input frame
$a_i$ and reference pattern frame $b_j$. Staggered-array dynamic
programming decreases the amount of calculation by permitting only two
kinds of dynamic time-warping paths (see Figure 4), thus reducing by
two-thirds the number of points at which repetitious calculation is
carried out. Various recognition experiments have confirmed that the
reduced number of calculations has little effect on accuracy [5].

Staggered-array dynamic programming has two advantages. First, it
eliminates the path-length mismatch inherent in the conventional
method (see Figure 3) by permitting only two kinds of symmetrical
paths. Second, it allows for endpoint ambiguity in the input speech.
The determination of word endpoints is especially difficult for
telephone speech, where there is always some line noise, making
tolerance of ambiguity particularly helpful.

The above process calculates the distance between the distance matrix
and each word template and takes the template with the smallest
distance as the prime candidate. However, if the result obtained with
dynamic time warping is not reliable, whether because the distance
value is not small enough or because the difference between the lowest
and second-lowest distance values is not large enough, then word
discrimination begins. Whereas dynamic time warping uses gross speech
features to identify words, word discrimination is based on
statistical features. It compares specific speech features by
transforming input speech into a sequence of phoneme-like


Figure 5. Speech synthesis method. Kana is a Japanese syllabary.


templates and comparing the patterns based on the probability of
transition from certain phoneme-like symbols to other symbols. The
discrimination function is a weighted sum of the transition
probabilities of phoneme-like templates between adjacent input frames.

We generate these weights for each word category by performing
statistical operations on training speech data, which is a part of the
collected speech data used to tune the system for actual services. In
preliminary experiments, when dynamic time warping yielded a
recognition error, word discrimination subsequently offered the
correct candidate more than 50 percent of the time. Therefore, the
order of the first two candidates is changed if the ratio of their
discrimination values is less than an experimentally determined
constant.
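One possible reading of this second stage is sketched below. The article gives neither the exact form of the discrimination function nor the direction of the ratio, so everything here is an assumption made for illustration: each frame is labeled with its nearest template, each adjacent-frame transition contributes its weight times its transition probability, and the two candidates are swapped when the first candidate's value divided by the second's falls below the threshold. All function and parameter names are hypothetical.

```python
def nearest_symbols(matrix):
    """Label each frame with the index of its closest phoneme-like template.

    matrix[i][l] is the distance between frame i and template l.
    """
    return [min(range(len(row)), key=row.__getitem__) for row in matrix]


def discrimination_value(symbols, transition_prob, weights):
    """Weighted sum of transition probabilities between adjacent frames.

    transition_prob[(s, t)] -- assumed probability of moving from symbol s to t
    weights[(s, t)]         -- assumed per-word weight learned from training data
    Transitions missing from either table contribute nothing.
    """
    total = 0.0
    for pair in zip(symbols, symbols[1:]):
        total += weights.get(pair, 0.0) * transition_prob.get(pair, 0.0)
    return total


def reorder(first, second, value_first, value_second, threshold):
    """Swap the top two candidates when the value ratio is below threshold."""
    if value_second > 0 and value_first / value_second < threshold:
        return second, first
    return first, second
```

In this sketch `threshold` plays the role of the experimentally determined constant; its value would have to be tuned on collected speech data.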

Speech synthesis. Linear predictive coding parameters are also
important in speech synthesis [6]. When only predetermined messages
are to be synthesized, human speakers record the messages, which are
converted into parameter sequences. In the synthesis stage, required
messages are reproduced from these parameters in a process called
“analysis synthesis.” When arbitrary messages are necessary, voice
messages are generated based on phonetic and linguistic rules for
speech synthesis. This process is called “synthesis by rule.”

Anser uses line-spectrum pair parameters [7] as the synthesis
parameters. These are one of the sets of feature parameters extracted
by linear predictive coding analysis. Because line-spectrum pair
parameters are frequency parameters rather than time parameters, they
can reproduce speech of almost the same quality as speech based on
ordinary linear predictive coding, while using only half as many
parameters.

Anser uses the consonant-vowel-consonant syllable as a synthetic
speech unit. Because the Japanese kana syllabary contains only vowels
and consonant-vowel syllables, Japanese sentences consist almost
entirely of alternate occurrences of vowels and consonants. The
consonant-vowel-consonant syllable is, therefore, the natural basic
unit for synthesizing Japanese. Also, the consonant-vowel-consonant
syllable allows synthesis based on the connection of basic units
rather than words. Lastly, since the connection of
consonant-vowel-consonant syllables occurs at the consonantal
low-energy speech break, the connection causes very little distortion.

Figure 5 diagrams the speech synthesis process. For fixed messages,
the sequence of line-spectrum pair parameters corresponding to the
message number is read from memory. For arbitrary messages, the input
character sequence is transformed into consonant-vowel-consonant
sequence symbols, and prosodic parameters are determined through
syntactic analysis of the sentence to be synthesized. The parameter
sequence corresponding to the consonant-vowel-consonant syllables is
read from memory, and such parameters as the duration of each
syllable, pitch contour, and power are generated accordingly. The
syllables are then connected. Finally, the parameters produced from
the analysis-synthesis procedure and the synthesis-by-rule procedure
are combined, and the data are fed into the line-spectrum pair
synthesizer for conversion into speech.

The system’s speech unit inventory comprises about 1,000
consonant-vowel-consonant syllable units plus about 400 supplementary
speech units, such as consonant-vowel, vowel-consonant, and
vowel-vowel units.

Experience with Anser 

Since its introduction, Anser has been applied mainly to banking
services such as fund transfer confirmations or account balance
inquiries.

Anser provides two basic kinds of banking services: notification
services and responses to inquiries. Notification services, which are
initiated by a bank computer, automatically inform the user of
deposits, automatic withdrawals, and other such debit and credit
transactions. For dial and touch-tone telephone users, the computer
calls and notifies the users via synthesized voice. Customers can also
receive responses to questions about account balances, information
about bank services, and guidance on bank procedures.

Table 1 illustrates how the service works in a typical inquiry
dialogue between a customer and Anser. First, the customer accesses an
Anser center. The Anser system asks the customer to identify by number
the service desired, the bank branch, the account number, and the
customer’s secret pass number. The system then sends the appropriate
inquiry message to the customer’s bank computer. The bank system
confirms the account and pass numbers, composes an answer, and
transfers it to the Anser system. Anser then converts the information
into a synthesized voice response.

Of Japan’s 612 financial institutions, 583 (95 percent) either offer
or are preparing to offer Anser banking services. The number of
customers using these services is increasing at a steady rate. The
total average traffic per month is about 16 million calls, and the
peak number of calls per day is more than 1 million.

Figure 6 shows a recent breakdown of input and response types. During
1981, most users accessed the system through ordinary telephones;
input was verbal or through key selection, and response was verbal.
Since then, the use of facsimile response has increased rapidly,
reflecting a general increase in facsimile’s popularity. Personal
computer and teletext use are also expected to increase.

Anser’s success. Why has the Anser system been such a success? One
reason is that an automatic telephone information system was urgently
needed. Many Japanese banks had been using human operators to provide
customer banking services. Customers want transaction information
immediately, and transactions generally occur at month’s end, so each
bank had to employ many operators to handle the monthly rush.

Anser also has been successful because it adapted to the predominance
of the rotary dial telephone. In 1981, more than 80 percent of the
telephone sets in Japan were rotary dial. Banks naturally demanded
that an ordinary telephone set be usable as an Anser data terminal.
Although the number of touch-tone telephones is rapidly increasing,
about 70 percent of the telephones in Japan are still rotary dial.
(Thus, there is a big potential for other telephone services that use
speech recognition technology.)

Lastly, Anser has been successful because it has transcended its
original role as a speech response and recognition system to become a
multimedia information system. In the present Anser system, voice
input/output is used mainly for such notification services as fund
transfer notices. Touch-tone telephones and personal computers are
usually used for services in which security is important. If Anser had
supported just voice communication, the service area would have been
smaller.


Table 1. Inquiry service process.

Customer: (Call the center)
System:   (Detect the call)
System:   “Hello, this is the NTT bank telephone service center.
          What is your service number?”
Customer: “One, one.”
System:   “You are asking for your account balance. What is your
          branch number?”
Customer: “One, two, . . .”
System:   “What is your account number?”
Customer: “Three, four, . . .”
System:   “What is your secret number?”
Customer: “Five, six, . . .”
System:   “Your current balance is 153,000 yen. If you would like to
          have your balance repeated, please say ‘once more.’ If not,
          say ‘OK.’”
Customer: “OK.”
System:   “Thank you very much.”


Anser’s future 

The securities industry, where such services as transaction
confirmations and balance updates are as common as in the banking
industry, has shown a very strong interest in Anser, which can also
provide fund transfer services relating to buying and selling stocks
and bonds.

Although only 25 securities companies now use Anser, its growth is
remarkable. Between January 1987 and January 1988, usage increased
from fewer than 1 million calls a month to more than 4 million.

Personal Anser system. Nippon Telegraph and Telephone realized that
extending the market for voice information services required reducing
the size and cost of the speech recognition unit. NTT developed a
single board with full speaker-independent speech recognition
functions. The board’s key component is an application-specific
large-scale integrated circuit [8], which performs the necessary
distance calculation and executes dynamic time warping.

Figure 7 shows the single-board recognition system, which consists of
a pulse-code modulation coder/decoder, a general-purpose
digital-signal processor, the application-specific LSI circuit, 128
Kwords of data memory, and interface circuitry linking the board to
the host computer. The digital-signal processor analyzes input speech
by linear predictive coding and performs speech detection.
Phoneme-like templates and multiple-word templates are loaded through
the host computer and stored in the data memory. The board is easily
operated by five commands through the host processor: noise
measurement, recognition, termination, data transfer, and hardware
examination.

The new board makes the Anser system more compact and economical.
Conventional Anser recognition equipment requires that each line have
a recognition circuit board three times the size of a sheet of A4
paper. The new board reduces the equipment size by 75 percent.
Combining this board with a personal computer, a text-to-speech unit,
and a network-control unit creates a personal telephone information
system.

Figure 7. Recognition board block diagram.

New services. Anser’s 16-word lexicon has restricted the system to
banking and similar services. Introducing such services as
reservations and telephone shopping will require more input commands
(a larger vocabulary and continuous speech input capabilities), even
with touch-tone telephones. These capabilities can be provided through
new technologies, such as hidden Markov models. Such statistical
models offer much more sophisticated performance with a large speech
database. A system that recognizes more than 100 words should be
introduced into applied service within two or three years. Recognition
of continuous speech will also be possible in the near future.


Telephone-based services have been the largest market for speech
recognition technology, and they will continue to dominate in the
future as new technologies refine and expand the available services.
The Integrated Services Digital Network will also help improve
telephone speech recognition by reducing interference and distortion.


References 

1. L.R. Rabiner, S.E. Levinson, and M.M. Sondhi, “On the Application
of Vector Quantization and Hidden Markov Models to
Speaker-Independent, Isolated Word Recognition,” Bell System Tech. J.,
Vol. 62, No. 4, Apr. 1983, pp. 1,075-1,105.

2. L.R. Rabiner et al., “Speaker-Independent Recognition of Isolated
Words Using Clustering Techniques,” IEEE Trans. Acoustics, Speech, and
Signal Processing, Vol. ASSP-27, No. 4, Aug. 1979, pp. 336-349.

3. N. Sugamura, K. Shikano, and S. Furui, “Isolated Word Recognition
Using Phoneme-Like Templates,” IEEE Int’l Conf. Acoustics, Speech, and
Signal Processing, 1983, pp. 723-726.

4. T. Nomura and R. Nakatsu, “Speaker-Independent Isolated Word
Recognition for Telephone Voice Using Phoneme-Like Templates,” IEEE
Int’l Conf. Acoustics, Speech, and Signal Processing, 1984, pp.
2,687-2,690.

5. K. Shikano and K. Aikawa, “Staggered Array DP Matching,” Trans.
Committee on Speech Research, J. Acoustical Soc. Japan, Vol. S82, No.
15, July 1982, pp. 113-120 (in Japanese).

6. K. Hakoda et al., “Japanese Text-to-Speech Synthesizer Based on
Residual Excited Speech Synthesis,” IEEE Int’l Conf. Acoustics,
Speech, and Signal Processing, 1986, pp. 2,431-2,434.

7. N. Sugamura and F. Itakura, “Speech Data Compression by LSP Speech
Analysis and Synthesis Technique,” IECEJ Trans., Vol. J64-A, No. 8,
Aug. 1981, pp. 599-605 (in Japanese).

8. S. Miki and K. Intoh, “Speaker-Independent Isolated-Word
Recognition LSI,” IEEE Int’l Conf. Acoustics, Speech, and Signal
Processing, 1989, pp. 793-796.



Ryohei Nakatsu is executive manager of the Research Planning Division
at Nippon Telegraph and Telephone Basic Laboratories. He joined NTT’s
Electrical Communications Laboratories in 1971, working in speech
recognition technology and taking part in development of the Anser
system.

Nakatsu received his BS, MS, and PhD degrees in electronics and
information engineering from Kyoto University in 1969, 1971, and 1982,
respectively. He is a member of the IEEE Acoustics, Speech, and Signal
Processing Society, the Acoustical Society of Japan, and the Institute
of Electronics, Information, and Communication Engineers of Japan.

Readers can contact Nakatsu at the Research Planning Division, NTT
Basic Research Laboratories, 9-11, Midori-cho 3-Chome, Musashino-shi,
Tokyo, 180 Japan.


Augmenting a Window 
System with Speech Input 


Chris Schmandt, Mark S. Ackerman, and Debby Hindus 
Massachusetts Institute of Technology 


Despite high expectations, there have been few convincing
demonstrations of speech input in desktop computing environments. We
have focused on window systems, where speech might provide an
auxiliary channel to support window navigation.

Xspeak, our speech interface to the X Window System, associates words
with each window. Speaking a window’s name moves it to the front of
the screen and moves the cursor into it. Speech does not provide a
keyboard substitute, but it does assume some of the functions
currently assigned to the mouse. Thus, a user can manage a number of
windows without removing his or her hands from the keyboard.

We provided this interface to a group of student programmers who used
it for several months. This pilot study was designed to identify some
initial considerations for using speech recognition in workstations.
The manner in which our programmers used voice pointed out its
strengths and weaknesses. Recognition accuracy was critical, although
some of our most enthusiastic users had some of the poorest
recognition scores. The most consistent users started to use more
windows and to allow more overlapping of windows. Some users had
already developed their own techniques, which our voice interface
couldn’t help, for coping with multiple windows. Speech proved to be
neither faster nor slower than the mouse, although the choice of which
medium to employ was in part related to what else the user was doing
with his or her hands.

With Xspeak, window navigation tasks usually performed with a mouse
can be controlled by voice. A new version, Xspeak II, incorporates a
language for translating spoken commands.


In a windowing environment, many applications support a
direct-manipulation interface, where the user can click on buttons,
pull scroll bars, and so on. Our student users complained about lack
of voice access to these mouse functions. This led us to develop a
user interface specification language so that voice commands could
interact with applications by generating a series of mouse-motion,
button-press, and key-press events. To improve recognition, vocabulary
subsets specific to an application can be enabled either by voice or
by mouse motion into a window.

The first part of this article gives some necessary background in
speech recognition and window systems, with an analysis of how they
might be combined. The second part describes Xspeak, our first
navigation application, including its operation and our field study of
its use. The final section introduces Xspeak II, an improved version
that includes a user interface specification language, a rich tool for
adding voice input to applications.

Background 

Speech is a difficult input medium. Although speech recognition has
received considerable positive publicity that has raised the
expectations of interface designers, the available devices leave much
to be desired, particularly in recognition accuracy. Many variables
affect error rates, including vocabulary size and composition, users’
attitudes and speaking styles, ambient noise, and microphone type and
placement [1,2]. Many of the high recognition rates reported are
achieved by skilled users reciting lists of words in acoustically
stable environments. This is very different from actual use in office
environments where users may not be used to speech recognition.

Because of the difficulties in achieving high recognition accuracy,
most successful applications use small vocabularies in amenable
environments. Examples include:

• Inspections, such as noting defects of major appliances on a factory
floor or testing circuit boards. The user’s hands remain on the target
of inspection, and voice is used to record results.

• Sorting, of either baggage or packages, where the user’s hands are
busy and voice is used to specify routing.

• Visual monitoring, especially with microscopes, for inspection in
integrated-circuit and biomedical applications. The user’s eyes are
accommodated to the task, and the user’s mouth may be in a stable
position for a microphone.

These situations benefit from voice primarily because the user’s hands
and eyes are otherwise occupied.

The role of speech recognition in desktop computing is not so well
established. There is little conclusive evidence that speech is
superior to the keyboard for data entry, much less for free-form
typing and editing. (For an excellent survey of the literature, see
Martin [3].) Much of the current work in large-vocabulary speech
recognition is biased toward development of the so-called “listening
typewriter,” an automatic transcription device for business
correspondence. Yet word processing may comprise only a fraction of a
user’s computer activities. And, although memorized text can be spoken
as much as 500 percent faster than it can be written, dictation is not
necessarily faster. Because composition requires the bulk of a
writer’s time, dictation may increase speed by only 20 to 65 percent
[4].

When might speech input be useful in a workstation? The evidence
suggests that voice input is more valuable in conjunction with other
input devices (such as keyboard and mouse). Judging by the successful
industrial applications of speech recognition, in which the user
performs an activity in parallel, we surmised that allowing users to
remain focused on the screen and keyboard, instead of fumbling for the
mouse, would be beneficial in a workstation environment.

To the extent that the tasks of navigation and interaction with the
applications are separable, a performance improvement might be
expected by splitting the input. Cognitive experiments have shown that
a person’s ability to perform multiple tasks is affected by whether
those tasks use the same or differing modes, for example, spatial and
verbal modes [5,6]. Such observations led Martin [3] to design an
experiment using speech recognition for an alternate input channel in
a CAD system employing both keyboard and mouse. Her subjects were
indeed more productive with the addition of voice. She attributed this
in part to the speed of speech recognition versus typing long command
names and in part to the ability of users to split attention across
channels, that is, to remain visually focused on the screen while
using spoken commands.

Martin’s second finding suggested the expected utility of speech as an
interface to a visually complex window system. Moving between tasks,
that is, between windows, is normally accomplished by using the mouse
to move a cursor. This requires both manual and visual attention.
Applying the divided-attention hypothesis and using different input
channels for different classes of tasks might enhance navigation
between windows.

Window systems. Windows are now commonplace on bitmapped computer
workstations. Window systems allow the screen to be divided into a
number of regions, with each region allocated to input or output from
a particular computer process or program. Because windows are so
ubiquitous and are indeed the substrate on which so much workstation
use is based, we felt that no research into speech and user interfaces
should ignore them. We chose to work with the X Window System because
it is a de facto standard across workstations.

The X Window System defines a standard way for application programs,
or X clients, to communicate with a separate process, the X server,
that controls screen display and handles user input. Servers are
typically provided by a hardware manufacturer. X clients include
applications such as Xterm, a terminal emulator; Xclock, a clock; and
Emacs, a programming editor.

Window managers, a specialized type of client, control the placement
of application windows on the screen, usually through user control.
They can vary a great deal in X, and because window managers are just
applications in X, they can be used interchangeably. (For a taxonomy
of window managers, see Myers [7].)

Two characteristics of window managers are important for our purposes.
First, they may be tiling or overlapping. With an overlapping window
manager, windows can partially obscure one another; with a tiling
window manager, they cannot. Second, window managers differ according
to how they select which window receives keystrokes. The mechanism for
shifting input focus may be click to focus (sometimes called
“sticky”), requiring a mouse button click within a window before
keystrokes are accepted. Or the window manager may be real-estate
based and automatically shift the focus to the window where the mouse
pointer appears. The window managers selected by our users were all
overlapping and real-estate based; none were modified.

Window systems and speech recognition. Where, then, can speech be most
profitably employed in a window system? And what functionality should
it augment?

Before placing a speech interface within a windowing system, we had to
consider how window systems are used. But there has been surprisingly
little study of this or why users prefer a particular interface.
Gaylin [8] discusses frequency of use of some window operations. Card,
Pavel, and Farrell [9] provide a loose taxonomy of how windows are
used in tasks. More important for our purpose was Bly and Rosenberg’s
[10] comparison of tiled and overlapped windows in a task that
involved searching for information between windows. When the amount of
text to be searched was not entirely visible, they found that
overlapping windows were more effective than tiled windows, with an
interesting bimodality. For the most-experienced users, overlapping
windows were faster, but for some less-experienced users, they were
significantly slower. Bly and Rosenberg attributed this to the added
navigational tasks of manipulating the various windows. However,
despite this added cognitive load, their users preferred overlapping
windows.

Window systems force use of a spatial metaphor. Users organize their
windows geometrically, perhaps stacking them in layers. Visually, it
is relatively simple to recognize a window when there are few windows
and each is in a distinct geometric position. But, as the number of
windows increases, it becomes progressively more difficult to find a
window through visual inspection. Moreover, the mouse, a
two-dimensional spatial input device, is not matched to the
two-and-a-half dimensions of overlapping windows. (Tiled windows, if
they are stacked in layers, may also have planes.) As the number of
windows grows, using the mouse to interact with a “buried” window
becomes more difficult. A window with no part exposed may be
inaccessible to the mouse until other windows are moved out of the
way.

Speech offers an alternative. Voice, not being tied to a spatial
metaphor, can interact with windows directly, regardless of their
degree of visual exposure. Speech, then, could let users employ many
task-specific windows. Furthermore, navigation is a good candidate for
optimization via the use of multiple input channels.

The above suggests that in a complex window environment, especially
with users who would like to create many windows, an interface
designed to improve navigation would provide faster access to various
windows. Therefore, navigation was our prime candidate for a speech
interface. Further, to the extent that navigation could be
differentiated as a separate task from the activities occurring within
each window, multimodal input might lessen the user’s cognitive load.
This could allow successful use of a larger number of windows
dedicated to specific tasks.

Xspeak 

Xspeak is an application, not a window manager, that allows voice
access to windows in the X Window System. It runs on Sun workstations
(it should run with any X Windows server), using a Texas Instruments
speech card in a PC-based audio server [11]. We did not modify the X
server, the user’s window manager, or any other application.

Xspeak associates windows with voice 
templates , words trained and stored in the 
recognizer and constituting its vocabulary. 
Speaking a window’s template pops the 
window to the foreground and moves the 
mouse pointer to the middle of the win¬ 
dow. The window manager, which does 
not distinguish this motion from mouse 
motion, shifts the input focus to the appro¬ 
priate window. At this point, keystrokes 
are directed to the application running 
within the window. Figure 1 shows this 
interprocess communication. 
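
The effect of speaking a window's template can be sketched with a toy model. This is only an illustration, not Xspeak's code: the Window and WindowStack classes, the window names, and the geometry values are hypothetical, and no real X calls are made. It shows the two steps described above, raising the window and moving the pointer to its middle.

```python
from dataclasses import dataclass


@dataclass
class Window:
    name: str    # spoken name bound to a recognizer template (hypothetical)
    x: int       # top-left corner on the screen
    y: int
    width: int
    height: int


class WindowStack:
    """Toy stacking order; index 0 is the topmost window."""

    def __init__(self, windows):
        self.windows = list(windows)
        self.pointer = (0, 0)

    def recall(self, name):
        """Raise the named window and move the pointer to its center."""
        win = next(w for w in self.windows if w.name == name)
        self.windows.remove(win)
        self.windows.insert(0, win)
        self.pointer = (win.x + win.width // 2, win.y + win.height // 2)
        return win


stack = WindowStack([
    Window("mail", 0, 0, 400, 300),
    Window("emacs", 100, 100, 600, 400),
])
stack.recall("emacs")
```

In the real system the window manager, not Xspeak, reacts to the pointer motion by shifting the input focus; the sketch stops at the pointer move for that reason.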

Xspeak also allows users to fully lower 
or raise the current window. Users can 
move between windows and rearrange 
them without removing their hands from 
the keyboard. However, the mouse re¬ 
mains the sole means for moving and re¬ 
sizing windows. These operations, which 
are much less frequent than navigational 
operations, 8 are cumbersome to perform 
with voice commands. 

Providing a speech interface to the win¬ 
dow system was relatively straightfor¬ 
ward. Because all applications are in sepa¬ 
rate processes from the X Windows server, 
moving a top-level window does not cor¬ 
rupt any application’s data structures. 
Xspeak would have been much more diffi¬ 
cult to implement in a window system that 
does not separate server and client. 

In Xspeak, a configuration file associ¬ 
ates window titles with the template num¬ 
bers in the recognizer. Xspeak associates 
the window title (the window name prop¬ 
erty set by the application on its top-level 
window) with a particular window ID, 
which is used to modify the window stack¬ 


ing order. Xspeak also lets users name new 
windows not found in the configuration 
file. To do this, the user clicks on the 
window being named so that Xspeak can 
determine the window ID. The user then 
speaks the new name, that is, he or she 
trains a recognizer template. The configu¬ 
ration file also lets users start applications. 
If a window’s name is spoken and no 
matching window ID can be found, the rest 
of the corresponding entry in the configu¬ 
ration file is executed to create the new 
window. For example, in a configuration 
file, this line 

emacs -f emacs 

would make Xspeak create an Emacs win¬ 
dow if one did not already exist. 
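
The lookup described above (raise the window if it exists, otherwise run the rest of the entry) might look roughly like the following sketch. The file format here is an assumption made for illustration: the first field of each line is taken as the spoken window name and the remaining fields as the command to run. The entries, the parse_config and on_word helpers, and the window ID are hypothetical, and the mapping to recognizer template numbers is left out.

```python
import shlex


def parse_config(text):
    """Map each spoken name to the command fields that follow it on its line."""
    entries = {}
    for line in text.splitlines():
        parts = shlex.split(line)
        if parts:
            entries[parts[0]] = parts[1:]
    return entries


def on_word(word, entries, open_windows):
    """Decide what a recognized word should do.

    open_windows maps window names to window IDs for windows that exist.
    Returns ("raise", window_id) or ("launch", argv).
    """
    if word in open_windows:
        return ("raise", open_windows[word])
    return ("launch", entries[word])


# Hypothetical entries: name first, then the command that creates the window.
entries = parse_config("clock xclock\nemacs emacs\n")
first = on_word("emacs", entries, open_windows={})
second = on_word("emacs", entries, open_windows={"emacs": 0x1234})
```

The first call finds no Emacs window and so asks for a launch; the second finds one and asks for it to be raised.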

Xspeak includes a graphical control 
panel (see Figure 2) that serves several 
functions. Its status display indicates to the 
user that the recognizer is working. When 
the user says a word, this panel displays the 
word or a message indicating that no word 
was recognized. This feedback is essential 
when recognition accuracy is low due to 
poor word training or increased back¬ 
ground noise. 

The control panel includes a button to 
invoke window naming. The user can dis¬ 
able or enable recognition, using another 
button, to avoid spurious recognition while 
conversing or answering the phone. From 
the Xspeak control panel, users can select 
the utility function (“util” in Figure 2) to 
access less frequently used commands. 
This panel contains commands to test, 
calibrate, and retrain the recognizer. Users 
attempting to improve recognition accu¬ 
racy frequently chose to retrain individual 
words. 

Microphones. We were unwilling to 
subject our users to head-mounted micro¬ 
phones because these microphones are 
uncomfortable and tend to slip. Also, if 
users forget they are wearing the micro¬ 
phone, they may be unpleasantly reminded 
when they attempt to drink coffee or an¬ 
swer the telephone. Instead we placed a 
super-cardioid microphone (Sennheiser 
ME-80) next to the workstation monitor. 
Our choice of microphones contrasts sig¬ 
nificantly with much other published work 
on speech recognition and resulted in poor 
recognition performance. It is more common 
to use a head-mounted, noise-canceling 
microphone to control microphone acoustics 
and compensate for background noise. 

A more directional microphone might 
have decreased background noise from 






















sources such as fans and telephones, but it 
would also have been more sensitive to the 
speaker’s position. To make our micro¬ 
phone work well, we had to change a 
number of internal parameters in the re¬ 
cognizer to deal with the higher noise level. 
(Although these internal parameters are 
documented, a systems developer unfamil¬ 
iar with the operation of speech recogniz¬ 
ers probably couldn’t decipher them, much 
less optimize their values.) 

Not using noise-canceling microphones 
tends to cause insertion errors, that is, 
picking up of background noise as speech. 
Recognizers are generally poor at discrimi¬ 
nating whether a particular word is within 
their universe of templates. The conse¬ 
quence of insertion errors is window re¬ 
configuration; suddenly, user input goes to 
the wrong window (especially annoying if 
keyboard noise caused the error). Thus, we 
set the rejection threshold on the recog¬ 
nizer rather high, at the price of making the 
rejection of correctly spoken words much 
more likely. 
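
The trade-off behind the rejection threshold can be shown with a few made-up numbers. The sketch below assumes the recognizer reports a similarity score where higher means a better match (a real recognizer may report a distance instead); the scores, the decide and error_counts helpers, and the two thresholds are illustrative only, not measurements from our system.

```python
def decide(score, threshold):
    """Accept an utterance only if its best-match score clears the threshold."""
    return score >= threshold


def error_counts(samples, threshold):
    """Return (insertion errors, rejected real words) for a given threshold."""
    insertions = sum(1 for score, is_word in samples
                     if not is_word and decide(score, threshold))
    rejections = sum(1 for score, is_word in samples
                     if is_word and not decide(score, threshold))
    return insertions, rejections


# (match score, was it really a trained word?) with made-up values
samples = [(0.90, True), (0.75, True), (0.60, True),
           (0.70, False), (0.55, False), (0.30, False)]

low = error_counts(samples, 0.50)   # permissive threshold
high = error_counts(samples, 0.80)  # strict threshold
```

With these values the permissive threshold accepts two noise events and rejects no real words, while the strict threshold accepts no noise but rejects two correctly spoken words.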

User experiences. To better understand 
how Xspeak would be used and what ef¬ 
fects it would have on users, we conducted 
a small pilot study. We wanted to know 
under what circumstances users would 
choose voice input for navigation, where 
they would encounter difficulties, and how 
Xspeak would affect their window use. By 
observing real users, we could learn what 
changes and enhancements were needed to 
improve Xspeak. We were also curious 
about how users would react to using the 
less-than-perfect speech-recognition sys¬ 
tem on a long-term basis. This last issue 
has not received much attention; most 
published studies of user reactions to 
speech-recognition systems have used the 
research technique of using a hidden 
human to simulate a perfect recognizer. 

Over a summer, four student program¬ 
mers in the speech group, as well as two of 
the authors, used Xspeak in their day-to- 
day programming tasks. With one excep¬ 
tion they were already familiar with the X 
Window System. After an entry interview, 
the users were trained to use Xspeak. We 
tracked usage over a two-month period via 
extensive automatic logging, periodic 
videotaping, and frequent short inter¬ 
views. We derived recognition accuracy 
rates, and we collected timing data for 
comparable mouse and speech actions over 
a small set of navigation tasks. (A detailed 
report on our methodology and results is 
available. 12 ) 

From our analysis of these empirical and 



Figure 2. Xspeak control and utility panels. 


observational data, we reached the follow¬ 
ing conclusions about our users’ experi¬ 
ences with Xspeak: 

• Recognition is not straightforward. 
Although we used Xspeak in relatively 
quiet offices, the microphone configura¬ 
tion resulted in recognition errors. 

Several steps were required to deal with 
these errors. First, we changed a number of 
the recognizer’s internal parameters. Sec¬ 
ond, we set a high recognition threshold. 
Third, we placed the microphone on a stand 
to one side of the monitor, carefully posi¬ 
tioned to point toward the user; a better 
solution might employ a microphone built 
into the keyboard or monitor bezel. Fourth, 
we provided our users with utility func¬ 
tions to retrain, recalibrate, and reset the 
speech recognizer; one user retrained 109 
individual words in 79 sessions. 

Despite these actions, low recognition 
accuracy rates remained a problem. They 
ranged from slightly less than 50 percent to 
more than 80 percent (measured for the six 
pilot users during a randomly selected 
session and confirmed in a follow-on study 
involving three of the six users). Poor rec¬ 
ognition accuracy was the greatest impedi¬ 
ment to acceptance of Xspeak. The users 
who persisted had some of the highest 
overall recognition rates but also devel¬ 
oped successful strategies to overcome 
errors. 

• Some programmers preferred using a 
faster workstation without Xspeak to using 


a slower workstation with the speech inter¬ 
face. This might have been exacerbated by 
relatively slow performance by the X 
Windows server for programmers devel¬ 
oping X Windows applications. In any 
case, a somewhat improved user interface 
is no substitute for a faster processor. 

• For simple change-of-focus tasks 
(moving the mouse from one exposed 
window to another exposed window), 
speech was not faster than the mouse. In 
fact, it was marginally slower. The speed 
advantage shifted toward speech if the 
destination window was partially or com¬ 
pletely hidden. Exposing such a window 
requires no additional time for a voice 
interface but does require several addi¬ 
tional mouse actions. 

• Navigation in a window system can be 
handled with speech input. Users were 
able to move among and restack windows 
with ease. They learned the interface 
quickly and needed little tutoring to use the 
basic functions, although the control panel 
required more training. 

• Some users were not helped much by 
the voice interface. One of them, a very 
experienced window-system user, had al¬ 
ready developed techniques for coping 
with many windows (by using many 
icons). Another spent much of his time 
thinking at the keyboard and had few inter¬ 
actions with his windows; with this work 
style, transitioning among windows may 
be less critical. 
























Verbs 

create           Start an application, thus creating its windows. 
recall           Reposition a window to the top of the window stack. 
hide             Reposition a window to the bottom of the window stack. 
return           Reposition a window to its previous position in the 
                 window stack. 
configure        Move or resize a window. 
place            Move the mouse to a specified position or named window 
                 without restacking. 
if-elseif-endif  Conditionally execute a block of instructions. 
wait-on          Stop execution until some condition is achieved or a 
                 timeout occurs. 
send             Send a specified X Windows event to the named 
                 application window. 
string           Send a series of keyboard events to the named 
                 application window. 
activate         Activate a recognizer subtemplate. 
name             Rename a window from a specified set of names. 

Conditions 

process          Determine whether the named process is executing. 
iconified        Determine whether the named window is iconified. 
map              Determine whether the named window is on the screen. 
xevent           Determine whether the specified X Windows event has been 
                 sent to a named window (used for handshaking with the 
                 server). 
timer            Determine whether a specified time has elapsed. 


Figure 3. G-XL language. 


• Toward the end of the observation 
period, we noticed that the users most 
inclined to use voice increased the number 
of overlapping windows or the degree of 
overlap. 

• We found the use of voice in naviga¬ 
tion an incomplete substitute for the 
mouse. Our users did not rely on the speech 
interface to the exclusion of the mouse. 
They still had to use the pointer to interact 
with the direct-manipulation interfaces 
within applications. Having a hand already 
on the mouse accounted for 59 percent of 
the times users navigated with the mouse 
rather than with Xspeak. Some users found 
it awkward to use both interfaces simulta¬ 
neously. Others wanted to use Xspeak to 
handle direct-manipulation buttons or to 
start programs. 

Xspeak II 

User interfaces require iterative design 
cycles. Hence, a key goal of our Xspeak 
prototype was to learn what facilities 
would be useful in a speech interface. 
After considering Xspeak’s usage, users’ 


requests, and our improved understanding 
of the possibilities of a speech interface, 
we redesigned Xspeak to fix bugs, correct 
mistakes, and, most importantly, add 
features. 

We made two major changes. First, since 
context-dependent recognition improves 
recognition rates, we added the ability to 
create subtemplates. 

Second, Xspeak II includes a special¬ 
ized language, G-XL, to facilitate general- 
purpose handling of the window system. 
Where Xspeak was limited in its use of the 
pointer device, Xspeak II allows greater 
flexibility in the speech interface. Users 
can employ direct manipulation using 
voice, interacting with an application in 
addition to simply selecting it. 

Application-sensitive recognition. 

Increasing Xspeak’s scope would require a 
potentially much larger vocabulary. But a 
larger vocabulary is apt to introduce more 
recognition errors because more words 
could be confusable. This standard speech- 
recognition trade-off was critical in 
Xspeak; its recognition accuracy was al¬ 
ready barely acceptable. 


To minimize the impact of this trade-off, 
voice-input applications commonly break 
the vocabulary into subsets. If the dialogue 
can be structured so that the branching 
factor remains small, then the number of 
words active at any point can be mini¬ 
mized. For Xspeak II, we chose to create 
vocabulary subsets according to applica¬ 
tions. Grouping and enabling the templates 
lets Xspeak II switch among applications 
as they are invoked. Xspeak II also main¬ 
tains cooperability with the mouse; when¬ 
ever the mouse is used to enter a window, 
the corresponding vocabulary subset is 
enabled. 
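
A minimal sketch of per-application vocabulary subsets follows. The Vocabulary class and the word lists are hypothetical; the sketch only illustrates how the set of words active at any moment stays small when one subset at a time is enabled alongside a general set.

```python
class Vocabulary:
    """General words are always active; one application subset at a time."""

    def __init__(self, general, subsets):
        self.general = set(general)
        self.subsets = {app: set(words) for app, words in subsets.items()}
        self.current = None

    def activate(self, app):
        """Enable the subset for app, e.g. when voice or mouse enters it."""
        self.current = app if app in self.subsets else None

    def active_words(self):
        words = set(self.general)
        if self.current is not None:
            words |= self.subsets[self.current]
        return words


vocab = Vocabulary(
    general=["emacs", "mail", "messages"],
    subsets={
        "emacs": ["save", "scroll up", "scroll down"],
        "mail": ["next", "reply"],
    },
)
vocab.activate("emacs")
emacs_words = vocab.active_words()
vocab.activate("mail")
mail_words = vocab.active_words()
```

Here eight words are trained in total, but the recognizer has to discriminate among at most six at a time.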

Xspeak II language, G-XL. The origi¬ 
nal Xspeak was limited in its range of 
operations. For instance, users could not 
use voice to control direct-manipulation 
objects such as scroll bars. Furthermore, 
there was no way to group functionality 
(such as having two windows pop to the 
top of the window stack), to conditionally 
invoke programs based on the user’s cur¬ 
rent environment, or to wait for a window 
to become exposed before proceeding. 

G-XL, Xspeak II’s language, addresses 
many of these limitations. It also meets 
three major requests of the pilot-study 
users: macro capability for all X Windows 
events, greater control over screen events 
and process sequencing, and direct ma¬ 
nipulation of objects. We have designed 
and are implementing G-XL to provide a 
flexible interface between speech and a 
window system. With G-XL, users can 
tailor their speech interface in a variety of 
ways. 

Figure 3 lists the G-XL language. The 
verbs create, recall, and hide provide the 
basic functionality of the original Xspeak. 
To provide needed flexibility, G-XL contains 
the if (condition)-elseif-endif construction 
and the wait-on (condition) construction. 
The conditions include map, to 
test whether a window is present on the 
screen; process, to test whether a process 
exists; timer, to test for an elapsed time; 
iconified, to test whether the application 
has been iconified; and xevent, to check 
for most X Windows input events on a 
window. 
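
The wait-on construction can be pictured as polling a condition until it holds or a timeout expires. The sketch below is an assumption about the general idea, not G-XL's implementation (which might block on server events rather than poll); the wait_on function, its parameters, and the window_mapped stand-in are hypothetical.

```python
import time


def wait_on(condition, timeout, poll_interval=0.01):
    """Return True once condition() holds, or False if timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Stand-in for a "map" condition: the window appears on the third check.
polls = {"count": 0}


def window_mapped():
    polls["count"] += 1
    return polls["count"] >= 3


appeared = wait_on(window_mapped, timeout=1.0)
timed_out = wait_on(lambda: False, timeout=0.05)
```

The first call returns as soon as the condition becomes true; the second gives up after the timeout, which lets a script continue instead of hanging when a window never appears.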

Figure 4 shows parts of a G-XL configu¬ 
ration file. The section beginning with 
emacs checks whether there is an Emacs 
editor on the screen. If there is, whether or 
not it is iconified, the full Emacs window is 
popped above any other window. If there is 
no Emacs window, one is created. The 
inner if block shows how a user can control 
the shape and position of the Emacs win- 












template general 
    mail 
    emacs 
        if (!map emacs) 
            if (!process dbx) 
                create emacs -r \ 
                    -geometry 80x50+10+50 
            elseif 
                create emacs -r \ 
                    -geometry 80x50+500+50 
            endif 
            wait (xevent MapNotify (window emacs)) 
        elseif 
            recall emacs 
            if (process dbx) 
                configure emacs 80x50+500+50 
            endif 
        endif 
        activate emacs 
    messages 
end template 



[Diagram labels: User mouse and keyboard input; X Windows server; Speech server; Xspeak; Artificial X Windows events; Application (e.g., Emacs); Window manager (e.g., twm)] 


Figure 5. Interaction between processes in Xspeak II. 


dow. In this case, if the debugger, dbx, is 
already running, Emacs will appear in a 
smaller window placed further to the right, 
so as not to obscure the dbx window. The 
configure verb can resize or move an X 
window. 
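
For readers more used to a general-purpose language, the control flow of the emacs entry in Figure 4 can be restated as a small Python function. This is a paraphrase under stated assumptions, not part of G-XL: the first inner test is read as "dbx is not running," the emacs_actions function is hypothetical, and the actions are returned as plain tuples rather than executed.

```python
def emacs_actions(mapped, dbx_running):
    """List the G-XL verbs the emacs entry would issue, in order."""
    actions = []
    if not mapped:
        # No Emacs window yet: create one, placed clear of dbx if it runs.
        geometry = "80x50+500+50" if dbx_running else "80x50+10+50"
        actions.append(("create", "emacs", geometry))
        actions.append(("wait", "MapNotify", "emacs"))
    else:
        # Window exists: bring it to the top, and move it if dbx runs.
        actions.append(("recall", "emacs"))
        if dbx_running:
            actions.append(("configure", "emacs", "80x50+500+50"))
    actions.append(("activate", "emacs"))
    return actions


fresh = emacs_actions(mapped=False, dbx_running=False)
debugging = emacs_actions(mapped=True, dbx_running=True)
```

In every case the entry ends by activating the emacs subtemplate, so the Emacs-specific vocabulary is enabled whether the window was created or merely recalled.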

Additionally, Xspeak users wanted to 
handle the direct-manipulation objects 
(widgets) that are part of the Xt toolkit and 
used extensively by X Windows applica¬ 
tions. To do this, G-XL allows placing the 
pointer within a specific window (to focus 
input) and sending an artificial input event. 
The place verb puts the pointer at a given 
(x,y) location relative to the current win¬ 
dow. The send verb takes most X Windows 
input event types and artificially sends 
them to the given window, usually the 
current window. Applications do not dis¬ 
tinguish between these artificial events and 
user input, as shown in Figure 5. The string 
verb represents a series of send keypress 
commands and provides for keyboard 
macros. 
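
The relationship between the string and send verbs can be sketched as a simple expansion. The event records below are plain dictionaries invented for illustration, and the string_events and deliver helpers are hypothetical; a real X client would build key-press events with keycodes and hand them to the server.

```python
def string_events(window, text):
    """Expand text into one synthetic key-press record per character."""
    return [{"type": "KeyPress", "window": window, "key": ch} for ch in text]


def deliver(events, inboxes):
    """Toy delivery: append each key to the target window's input queue."""
    for event in events:
        inboxes.setdefault(event["window"], []).append(event["key"])


inboxes = {}
deliver(string_events("emacs", "ls\n"), inboxes)
typed = "".join(inboxes["emacs"])
```

The receiving application sees only the queued keys, which is why it cannot tell a spoken macro from typing.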

G-XL can send keystrokes directly to an 
application, but it is more than a simple 
keyboard macro package. First, G-XL 
knows about a number of X Windows event 
types, including keystrokes, button 
presses, and those events returned from the 
server when, for example, a window is 
created or resized. Second, as described 
above, a number of its primitives are actu¬ 
ally functions that allow sequencing of 
operations in the multiprocess X Windows 
environment, where requests must be 
acted on by the server. For example, if a 
window is obscured by another window, 
the button-press event cannot be sent to the 
application until the window has been 
exposed. 

Three additional features round out 
G-XL. The activate verb activates a recog¬ 
nizer subtemplate. While the general tem¬ 
plate is always active, subtemplates are 
swapped in and out as required; this also 
provides local scoping. The return verb 
restores the window stack and the pointer 
location to their states before the current 
subtemplate was activated. Finally, the 
name verb allows greater flexibility in 
dynamically creating windows. On a 
create request, which sequentially could 
produce several windows with the same X 
title, the name verb renames those win¬ 
dows from a set of given names. 
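
The interplay of activate and return can be pictured as saving and restoring a snapshot. The sketch below is a guess at the bookkeeping involved, not the Xspeak II implementation; the SpeechSession class, the window names, and the pointer coordinates are hypothetical.

```python
class SpeechSession:
    """Toy state: window stack (topmost first) and pointer position."""

    def __init__(self, stack, pointer):
        self.stack = list(stack)
        self.pointer = pointer
        self.subtemplate = None
        self._saved = None

    def activate(self, subtemplate):
        # Snapshot the state so a later return can undo any restacking.
        self._saved = (list(self.stack), self.pointer)
        self.subtemplate = subtemplate

    def recall(self, name, pointer):
        """Raise a window and move the pointer into it."""
        self.stack.remove(name)
        self.stack.insert(0, name)
        self.pointer = pointer

    def return_(self):
        """Restore the stack and pointer saved at the last activate."""
        if self._saved is not None:
            self.stack, self.pointer = self._saved
            self._saved = None


session = SpeechSession(["mail", "emacs", "xterm"], pointer=(50, 50))
session.activate("emacs")
session.recall("emacs", pointer=(400, 300))
after_recall = list(session.stack)
session.return_()
```

After the return, the stack and pointer are back where they were before the subtemplate was activated.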

G-XL configuration files are compiled 
by the user and may be specified on the 
Xspeak II command line. For example, a 
user might have several different configu¬ 
ration files corresponding to various win¬ 
dow managers. 


Figure 4. Sample G-XL template. 


We intend to make Xspeak II 
available to a wider audience 
and to closely monitor their 
usage patterns and reactions. We also want 
to gather data to evaluate how adding 
a speech channel for navigation affects 
users’ number, placement, and use of 
windows. 


Even with the limited utility of our ini¬ 
tial prototype, it is clear that at least some 
users find a speech interface comfortable 
and beneficial. As we discover how to in¬ 
tegrate voice with window systems, we 
will progress towards a deeper understand¬ 
ing of the roles voice can play in desktop 
computing environments. ■ 





















Acknowledgments 

Sanjay Manandhar did much of the original Xspeak programming to 
work out the basic interaction mechanisms. Gale Martin provided insight¬ 
ful early discussions. Wendy Mackay provided invaluable assistance with 
our use of video as an evaluation method. Ralph Swick helped us with the 
more arcane aspects of the X Window System. The reviewers made many 
pertinent comments and suggestions. This project was funded by the MIT 
X Consortium and Sun Microsystems. 

References 

1. H.C. Nusbaum et al., “Testing the Performance of Isolated-Utterance 
Speech Recognition Devices,” Proc. 1986 Conf., American Voice 
I/O Society, pp. 393-408. 

2. A.W. Biermann et al., “Natural Language with Discrete Speech as a 
Mode for Human-to-Machine Communication,” Comm. ACM, Vol. 
28, No. 6, 1985, pp. 628-636. 

3. G.L. Martin, “The Utility of Speech Input in User-Computer 
Interfaces,” Int’l J. Man-Machine Studies, Vol. 30, 1989, pp. 355-375. 

4. J.D. Gould, “How Experts Dictate,” J. Experimental Psychology: 
Human Perception and Performance, Vol. 4, No. 4, 1978, pp. 648- 
661. 

5. D.A. Allport, B. Antonis, and P. Reynolds, “On the Division of 
Attention: A Disproof of the Single-Channel Hypothesis,” Quarterly 
J. Experimental Psychology, Vol. 24, 1972, pp. 225-235. 

6. A. Treisman and A. Davies, “Divided Attention to Ear and Eye,” in 
Attention and Performance, Vol. IV, 1973, pp. 101-117. 

7. B.A. Myers, “Window Interfaces: A Taxonomy of Window Manager 
User Interfaces,” IEEE Computer Graphics and Applications, Vol. 8, 
No. 5, Sept. 1988, pp. 65-84. 


Confused About 
Software Engineering? 

If you have more CASE questions than 
answers, give us a call. With tools and 
services to support 

■ Structured analysis and design 

■ Real-time techniques 

■ Ada-specific development 

■ Object-oriented methods 

■ DoD-STD-2167 documentation 

we have straight answers to your CASE 
questions. Call us today at 213/541-6414. 
Because it doesn't have to be a 
CASE of confusion. 


_ Griffin 

Software Systems Technologies, Inc. 
2420 Via Carrillp, Palos Verdes, CA 90274 


8. K.B. Gaylin, “How Are Windows Used? Some Notes on Creating an 
Empirically Based Windowing Benchmark Task,” Human Factors in 
Computer Systems — CHI 86 Conf Proc., ACM, 1986, pp. 96-101. 

9. S.J. Card, M. Pavel, and J.E. Farrell, “Window-Based Computer 
Dialogues,” Proc. Interact 84, First IFIP Conf. Human-Computer 
Interaction, IFIP, Geneva, Switzerland, 1984, pp. 239-243. 

10. S.A. Bly and J.K. Rosenberg, “A Comparison of Tiled and Overlap¬ 
ping Windows,” Human Factors in Computer Systems — CHI 86 
Conf. Proc., ACM, 1986, pp. 101-106. 

11. C. Schmandt and M. McKenna, “An Audio and Telephone Server for 
Multimedia Workstations,” Proc. Second IEEE Conf. Computer 
Workstations, Computer Society Press, Order No. 810,1988, pp. 150- 
i so 


12. C. Schmandt et al., “Observations on Using Speech Input for Window 
Navigation,” tech. report, MIT Media Lab, Cambridge, Mass., 1990. 



Chris Schmandt is a principal research scientist and the director of the 
Speech Research Group at MIT’s Media Laboratory. He has been at the 
Media Laboratory since its creation five years ago and spent the previous 
five years with its predecessor, the Architecture Machine Group. 
Schmandt’s research covers a broad range of conversational computer 
systems, with current emphasis on voice and telephone interactions with 
workstations. He holds BS and MS degrees from MIT. 



Mark S. Ackerman is a doctoral candidate in information technology at 
MIT. He works with the Coordination Technology Group at the Center for 
Coordination Science and with the Speech Research Group at the Media 
Laboratory. His current research interests include window systems, elec¬ 
tronic reference systems, and human-computer interaction. He received a 
BA in history from the University of Chicago and an MS in computer and 
information science from Ohio State University. 



Debby Hindus is a graduate student and research assistant in the Speech 
Research Group at MIT’s Media Laboratory. Her research interests are in 
incorporating voice input and output into everyday computer use. Hindus 
previously consulted on user interface and software development issues. 
She received a BS degree in computer science from the University of 
Michigan. 

The authors’ address is Media Laboratory, Massachusetts Institute of 
Technology, E15-327, 20 Ames St., Cambridge, MA 02139. 


Reader Service Number 5 
















DASFAA’91: Call for Papers 


April 2 - 4, 1991 
Kogakuin University 
Shinjuku, Tokyo, Japan 


Second International Symposium on Database 
Systems for Advanced Applications 


DASFAA is an international symposium which especially 
focuses on advanced database applications and advanced 
database technologies. The First DASFAA was held in 
Seoul, Korea on April 10-12, 1989, with over 200 par¬ 
ticipants from various countries. The Second DASFAA 
will bring together researchers, developers and advanced 
users from academia, business and industry to exchange 
ideas and information. Papers for implemented systems 
are strongly solicited. 

DASFAA’91 is sponsored by the Information Processing 
Society of Japan (IPSJ). 

DASFAA’91 is supported by Korea Information Science 
Society (KISS), Australian Computer Society, Canadian 
Information Processing Society and Singapore Computer 
Society. 

DASFAA’91 is produced in cooperation with IEEE Com¬ 
puter Society and ACM SIGMOD. 


Topics (not exclusive) 


• Advanced Database Applications: 

    CIM Databases 
    Office Information Systems 
    Medical Information Systems 
    Engineering Databases 
    Databases for CASE 
    CAI Databases 
    Databases for Scientific Computing 
    Social/Governmental Information Systems 

• Advanced Database Technologies: 

    Object-Oriented DBMS 
    Deductive DB/Knowledge Base Systems 
    Hypermedia/Hypertext 
    Multimedia DBMS 
    Visual Interface 
    Database Interface Toolkits 
    Database Programming Languages 
    Database Design Tools 
    Query Processing Techniques 
    Distributed Databases 
    Database Processors 
    Information Resource Management 
    Heterogeneous Database Systems 
    Parallel Database Processing 

• Implementations & Evaluations: 

IEEE COMPUTER SOCIETY 


Conference Committee 


Takeo Miura, IPSJ, Honorary Chair 
Yahiko Kambayashi, Kyushu U., General Chair 
Tsuneo Kurokawa, Kogakuin U., Organizing Chair 
Yoshifumi Masunaga, ULIS, Executive Chair 
Kenji Suzuki, NTT, Executive Vice-Chair 
Myung-Joon Kim, ETRI, Executive Vice-Chair 
Jung-Kook Hong, IBM Japan, Executive Vice-Chair 
Ryosuke Hotaka, Tsukuba U., Tutorial Chair 
Hajime Horiuchi, Hitachi, Tutorial Vice-Chair 


Program Committee 


Akifumi Makinouchi, Kyushu U., Program Chair 
Katsumi Tanaka, Kobe U., Program Vice-Chair 
Kyu-Young Whang, KAIST, Program Vice-Chair 


F.Bancilhon     H-C.Kang        S.Nishio 
E.Bertino       T.Kato          N.Ohbo 
C.C.Chang       K.Kawagoe       J-T.Park 
J.C.Freytag     M.Kitsuregawa   R.Sacks-Davis 
T.Furukawa      Y.Kiyoki        A.Sakamoto 
J-K.Hong        T.Koyama        K.Sato 
T.Harder        J-Y.Lee         Y.Tanaka 
H.Ikeda         Y-J.Lee         S.Uchinami 
Y.Izumida       T.W.Ling        Y.Udagawa 
T.Kameda        S.Miranda       T.Uemura 
K.Kanasaki      S-C.Moon        K.Yokota 


To Submit Papers 


Authors are invited to submit three copies of a full original 
paper (up to 5000 words) in English before September 15, 
1990 to: 

Program Committee Chair: 

Prof. Akifumi MAKINOUCHI 
Dept. of Computer Science and 
Communication Engineering 
Kyushu University 
Hakozaki 6-10-1, Fukuoka 812, Japan 
Tel: +81-92-641-1101 ext. 6055 
Fax: +81-92-641-1825 

E-mail: akifumi@vax88.csce.kyushu-u.ac.jp 
Authors will be notified of acceptance/rejection by December 
4, 1990. Final papers will be due by January 15, 1991. 


Further Information 


Requests for further information should be addressed to: 
Executive Chair: 

Prof. Yoshifumi MASUNAGA 
University of Library and 
Information Science 

1-2 Kasuga, Tsukuba, Ibaraki 305, Japan 
Tel: +81-298-52-0511 ext. 340 
Fax: +81-298-52-4326 
E-mail: masunaga@ulis.ac.jp 

THE INSTITUTE OF ELECTRICAL AND 
ELECTRONICS ENGINEERS, INC. 



















CALL FOR PAPERS 
INTERNATIONAL WORKSHOP ON FORMAL METHODS IN VLSI DESIGN 
Puerto Rico, January 9-11, 1991 

ACM SIGDA, in cooperation with IFIP WG 10.2 - WG 10.5, IEEE TC VLSI 


Background 

There is increasing interest, both in academia and industry, in the 
application of formal methods to the design of integrated systems. 
Some of this interest has been motivated by the urgency of im¬ 
proving the reliability, testability and robustness of designs. The 
aim of this series of workshops is to bring together researchers 
interested in the application of formal techniques to the hardware 
design process. The emphasis of this year’s meeting is to provide 
an opportunity for synergistic interaction between researchers in 
"traditional" CAD and those interested in formal approaches to 
design. 


Focus 

Papers describing original work in all aspects of formal hardware 
design methods are invited. Topics include, but are not limited to: 

• formally based automated/interactive synthesis methods 

• formal hardware verification methods 

• models for timing specification and verification 

• high level specification techniques (with well defined 
semantics) 

• hardware description languages 

• use of theorem provers for verification 

• design for verifiability 

• correctness preserving transformations 

• formal approaches to design/synthesis for testability 

• microprogram verification 

• practical experiences 

• formal models for design 


Prior Workshops 

This workshop is intended as a series in North America to comple¬ 
ment a corresponding series held in Europe. The latter series has 
been organized within the scope of IFIP WG 10.2 (System Descrip¬ 
tion and Design Tools) and IFIP WG 10.5 (Very Large Scale 
Integration); the latest of these workshops was held at Houthalen, 
Belgium in November 1989. 


Participation/ Registration 

Those interested in presenting their work should submit 8 copies 
of a paper (or an extended abstract) to the workshop chairman. If 
you would like to participate in the workshop, please submit 4 
copies of (1) an abstract (1-2 pages) summarizing your research 
projects, (2) a provocative position statement indicating directions 
in which you believe the field is (or should be) headed; and (3) 
(optionally) a reprint of your most relevant publication for the 
symposium. 

Attendance at the workshop will be restricted in order to promote 
increased interaction. Since considerable interest has been 
expressed, preference will be given to participants who express their 
interest in participating early. Please include a phone/FAX number 
and an electronic mail address on the manuscript in addition to 
your regular mailing address. 


Authors’ Schedule 

• Deadline for submission of papers: August 15, 1990. 

• Notification of acceptance: October 15, 1990. 

• Camera-ready paper for circulation at the workshop: 
November 15, 1990. 

Revised versions of selected papers may be considered for publi¬ 
cation in a special issue of an archival journal. 


Program Committee 

L. Berman (IBM), D. Borrione (IMAG), R. Bryant (CMU), 

R. Camposano (IBM), S.K. Chin (Syracuse), L. Claesen 
(IMEC, European co-chair), S. Devadas (MIT), D. Dill (Stanford), 
H. Eveking (Darmstadt), M. Fourman (Edinburgh), W. Hunt 
(Computational Logic Inc.), K. Keutzer (AT&T), G. Milne 
(Strathclyde), P. Prinetto (Torino), A. Sangiovanni-Vincentelli 
(Berkeley), P.A. Subrahmanyam (AT&T, Workshop Chair). 


Mail Submissions to: 

P.A. Subrahmanyam 
Rm 4E-530 
AT&T Bell Labs 
Holmdel, NJ 07733 
Phone: (201)-949-5812 
e-mail: subra@vax135.att.com 
Fax: 201-949-3697 (also 201-949-9118) 








Talk and Draw: 
Bundling Speech 
and Graphics 


Mark W. Salisbury, Joseph H. Hendrickson, 
Terence L. Lammers, Caroline Fu, and Scott A. Moody 
The Boeing Company 


In the effort to improve communications 
between humans and computers, 
the obvious (and probably unattain¬ 
able) standard of comparison is natural, 
face-to-face human dialogue. Most re¬ 
search trying to approximate this form of 
communication in a human-computer in¬ 
terface has focused on natural language 
processing and speech perception. But 
human communication is multidimen¬ 
sional, and a conversation includes more 
than just spoken words. 

As a dramatic illustration of the nonver¬ 
bal aspects of human communication, 
consider the difficulty in demonstrating the 
presence of aphasia. Aphasia is a neuro¬ 
logical condition which, without affecting 
intelligence, renders one incapable of 
understanding spoken words. Neverthe¬ 
less, aphasics often show a surprising 
ability to understand most of what is 
said to them, leading neurologist Oliver 
Sacks to observe: “[S]peech—natural 
speech—does not consist of words 
alone. ... To demonstrate their aphasia, 
one had ... to speak and behave unnatu¬ 
rally, to remove all the extraverbal cues— 
tone of voice, intonation, suggestive em¬ 
phasis or inflection, as well as all visual 
clues (one’s expressions, one’s gestures, 
one’s. . . posture).” 1 




Responding to 
simultaneous spoken 
and graphical input, a 
computer interface in 
development for the 
AWACS defense 
system increases 
operator effectiveness 
in directing military 
resources. 


Although a computer’s understanding 
of speech might benefit from information 
conveyed by such nonverbal means as 
body language, a computer cannot easily 
collect and process this information. 
Computer graphics, however, is one area 
that has well-developed data collection 
and display capabilities. Although not 

0018-9162/90/0800-0059$01.00 © 1990 IEEE 


included in every conversation, graphics is 
a natural part of human communication. 
For example, people often communicate 
an idea with a spatial component, such as a 
house layout, directions to a place, or a 
football play, by sketching a diagram as 
they speak. This combination of simulta¬ 
neous verbal and graphical data, where 
each is used to complete or disambiguate 
the other, we have termed “talk and draw.” 
In this article we describe an application 
that embodies some of the characteristics 
of the talk-and-draw concept. 

Research in human perception supports 
the intuitive appeal of the talk-and-draw 
communication paradigm. The work of 
Rohr 2 demonstrates that some concepts are 
inherently graphical while others are in¬ 
herently verbal. This indicates that, in any 
but the simplest applications, a composite 
of graphical and verbal I/O techniques is 
desirable. Furthermore, additional re¬ 
search has shown that concepts in a wide 
range of fields are generally understood 
when presented in a mixed verbal and 
graphical mode. 3 Consider an application 
displaying a map. The user might say “En¬ 
large” and simultaneously use a mouse to 
indicate a region on the screen. The com¬ 
mand to create an enlarged view of the 
region is given verbally while the indication
of what constitutes the region is given
graphically. 

Earlier work at Boeing in voice-con¬ 
trolled computer applications includes a 
robotic vocational workstation for the 
physically disabled professional. 4 
Through voice commands and a specially 
designed robotic arm, users can retrieve 
documents from a printer, pick up books, 
and perform other manipulative tasks. A 
voice-operable telephone management 
system allows users to receive telephone 
calls, record notes, record incoming mes¬ 
sages, record partial or complete conversa¬ 
tions, create phone number indexes and 
directories, and access on-line databases 
and bulletin boards. The workstation can 
be connected to various network systems, 
allowing users to access information from 
remote computer sites by voice. Users 
activate and shut down their workstations 


by moving their chairs to break a light 
beam underneath their desks. 

Another project, more closely related to 
the talk-and-draw concept, is the “put-that- 
there” system, developed at the Massachu¬ 
setts Institute of Technology. 5 In this ap¬ 
plication the operator was seated in a room 
dominated by a large screen displaying 
geometric forms such as circles, squares, 
and triangles. Users could manipulate 
these figures by creating, deleting, nam¬ 
ing, and moving them. They could also 
change their size and color. User com¬ 
mands were in the form of spoken com¬ 
mands and x-y coordinates indicated by a 
magnetic pointing device. A typical com¬ 
mand was “Put the green circle there,” with 
the pointing device indicating what was 
meant by “there”; or “Move that above the 
green circle,” with the pointing device 
indicating the object to be moved. 


Background and 
motivation 

The development of a workstation with 
integrated graphical and voice I/O is one 
aspect of an effort by Boeing’s Aerospace 
and Electronics Division to improve the 
man-machine interface (MMI) of the Air¬ 
borne Warning and Control System 
(AWACS). Such improvements are neces¬ 
sary because of increased demands placed 
on AWACS operators by the operational
and threat environments. 

AWACS is an integrated command, 
control, communications, and intelligence 
(C³I) system with advanced radar surveillance
and data processing capabilities. The
AWACS aircraft is a militarized version of 
the Boeing 707, with a rotating disk that 
houses the radar and IFF (identification— 
friend or foe?) antennas affixed to the top. 
Inside the aircraft the surveillance infor¬ 
mation is processed by a mainframe com¬ 
puter and distributed to 14 operators at 
multipurpose consoles. AWACS supports 
on-board personnel in identifying and 
tracking airborne and surface targets for 
air traffic control, early warning of an 
enemy threat, and direction of intercepters 
to their targets. 

As shown in Figure 1, the current 
AWACS workstation (that is, without the 
integrated voice-graphical interface de¬ 
scribed in this article) consists of a console 
with a vertically oriented, 19-inch diago¬ 
nal, high-resolution CRT. Two-thirds of 
the display area is dedicated to a graphical 
situation display. The other third is for 
alphanumeric information. Console con¬ 
trol and data entry require 140 momentary 
contact switches and 63 multiposition 
switches. Cursor control is accomplished 
with a trackball. The console also contains 
an audio panel to control radio and inter¬ 
com communications. 

We are using incremental prototype 
development to improve the MMI for 
AWACS. To begin, we analyzed operator 
tasks to identify functions that would be 
better performed with voice I/O. On the 
basis of this analysis, we developed a 70- 
word vocabulary and corresponding gram¬ 
mar for an ITT VRS-1280 speech recog¬ 
nizer. The ITT unit is a word-based, con¬ 
tinuous, speaker-dependent recognizer 
with a 500-word maximum vocabulary. 
The vocabulary we developed makes up 
several spoken commands created from a 
subset of phrases taken from operators’ 
normal conversation. A representative list 
of these commands is shown in Table 1. 

















Table 1. Partial list of input commands for prototype AWACS interface.

Voice Input                      Operational Function
-------------------------------  ---------------------------------------------------------------------------------
Modify sector 2, fighter threat  Sensor suite control
Create sector 4, missile threat  Sensor suite control
Tactical, rifle 5                Display commit recommendations and options for fighter flight
Broadcast, bullseye 3            Display all targets’ bearing and range from common reference point
Rifle one flight, check fuel     Eavesdrop on request for fuel status from pilots and prepare database for input
Rifle one, 8,000                 Input fuel status
Rifle three, bingo               Input fuel status
Display hawk belt                Database query and display control of friendly surface-to-air missiles


We also developed responses for a DEC 
Talker speech synthesizer. These are typi¬ 
cally short. Some responses provide feed¬ 
back to inform the operator that the input 
was understood (or not understood); others 
alert the operator to alarming situations. 

Our work extended earlier work at Boe¬ 
ing that had demonstrated the promise of 
voice I/O technology as an MMI improve¬ 
ment for AWACS. 6 Using the commands 
and responses we developed, our proto¬ 
type increased operator effectiveness for 
several functions, including fuel status 
updates, fighter commitment, and tactical 
and broadcasting activities. We found that 
a short, spoken phrase such as “Display 
sensors” was a natural and intuitive way to 
request display data to be overlaid on the 
background map. Since it is a short and 
straightforward request, it was spoken 
clearly and was consistently recognized. 
Not having to repeat the phrase contributed 
greatly to user acceptance. As a result of 
this finding, we expanded the grammar and 
vocabulary so that a wide variety of over¬ 
lays could be requested through voice 
input. 

In general, long, continuous input 
phrases caused more errors by the speech 
recognizer and were not as well accepted 
by users. As phrases became longer, they 
more frequently contained pauses and 
mispronunciations, resulting in lower rec¬ 
ognition accuracy. For example, a long 
database query might be “Show track 
number and characteristics of tracks with 
status level three.” In this case, recognition 
was improved by a shorter phrase, “Show 
level three tracks,” which generated a table 
with predetermined attributes (track num¬ 
ber and characteristics) for tracks with 
three as a value for status. 
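
One way to picture this shortening is as a lookup from a short spoken phrase to a predetermined query. The sketch below is illustrative only: the function name, the dictionary-shaped query, and the number-word table are assumptions made for the example and are not taken from the prototype.

```python
import re

# Hypothetical number words; the article does not list the real vocabulary.
WORD_TO_INT = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

# Attributes the short phrase implies (from the article's example).
DEFAULT_ATTRIBUTES = ["track number", "characteristics"]


def expand_short_query(phrase):
    """Expand 'Show level <n> tracks' into a query with preset attributes.

    Returns None when the phrase does not fit the pattern.
    """
    match = re.fullmatch(r"show level (\w+) tracks", phrase.strip().lower())
    if match is None or match.group(1) not in WORD_TO_INT:
        return None
    return {
        "select": list(DEFAULT_ATTRIBUTES),
        "where": {"status": WORD_TO_INT[match.group(1)]},
    }
```

The operator speaks only the part that varies (the status level); everything else is fixed in advance, which keeps the utterance short.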

Shorter phrases were also better for syn¬ 
thesized voice responses; longer phrases 
seemed to distract busy users. Conse¬ 
quently, we shortened longer phrases and 
dropped responses that seemed unneces¬ 
sary or redundant. 

In addition to increasing recognition 
accuracy, shorter command phrases were 
more easily integrated into workstation 
operational procedures. For example, us¬ 
ers can easily perform other activities 
such as making a related menu selection 
while speaking a short phrase. Therefore, 
we added the capacity to process simul¬ 
taneous graphical and voice I/O to the 
prototype. 

The simultaneous processing of graph¬ 
ical and voice input gives a talk-and-point 
feel to the prototype interface. As shown in 
Figure 2, the interface provides a dynamic
situation display. Operators do not create
freehand graphical objects but interact
with the display to make tactical assessments
and to direct military resources.
They input requests by speaking commands
into a head-mounted microphone
and simultaneously selecting graphical
objects with a mouse. The system outputs
results by simultaneously displaying
graphics and synthesizing speech. The
following section focuses on the input side
of this combined voice-graphical interactive
technique.

Figure 2. Prototype voice-graphical AWACS interface.

Figure 3. Processing simultaneous voice and graphical input.

Implementation 

Figure 3 shows the system architecture 
for integrating voice and graphical input in 
the AWACS interface prototype. Input 
processing begins with the user’s speaking 
a phrase. The speech recognizer sends an 
ASCII text string representing the phrase 
to Gerbal (graphical/verbal dialogue man¬ 
ager). After a short waiting period, Gerbal 
sends a request to the blackboard manager 
for the graphical events that have occurred 
since the previous speech input. This wait¬ 
ing period, roughly one-half second, al¬ 
lows collection of graphical events that 
occurred after the speech utterance was 
completed. We are adjusting the length of 
the waiting period as we gain more experi¬ 
ence with the interface. 

Graphical events are collected by a proc¬ 
ess called the graphics handler, which also 
performs other tasks such as displaying the 
maps and icons that comprise situation 
displays in the AWACS application. An 
important function of the graphics handler 
is to maintain a link between graphical 
objects and any associated information in 
a database. The graphics handler asyn¬ 
chronously collects graphical events and 
passes them to the blackboard manager 
for storage. 

Most of the graphical events are selections
(“clicks”), made with the cursor control device,
of a graphical object that has
data associated with it. For example, clicking
on an airplane icon would return “airplane”
as the class of object, with values
for location, speed, and fuel status. Graph¬ 
ical objects are created by means of stan¬ 
dard graphics primitives such as lines, 
circles, and polygons. They are then linked 
with associated information, including 
their class and screen location, which is 
kept in a relational database. The graphics 
handler determines which graphical object 
was selected by the user by searching this 
database to find the object closest to the 
click within a specified tolerance. 
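
The selection step can be sketched as a nearest-object search with a cutoff. This is a minimal illustration, not the prototype's code: a list of dictionaries stands in for the relational database, and the field names `x`, `y`, and `id` are assumptions.

```python
import math


def pick_object(objects, click, tolerance):
    """Return the object closest to the click, or None if none is within tolerance.

    objects   -- iterable of dicts with screen coordinates under "x" and "y"
    click     -- (x, y) screen position of the click
    tolerance -- maximum distance, in the same screen units
    """
    best, best_dist = None, tolerance
    for obj in objects:
        dist = math.hypot(obj["x"] - click[0], obj["y"] - click[1])
        if dist <= best_dist:
            best, best_dist = obj, dist
    return best
```

A `None` result corresponds to a click that is not near any graphical object.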

In addition, two application-specific 
graphical events have been developed for 
the AWACS interface. One is the “lat- 
long” selection, a default event that takes 
place when a click on the background map 
occurs without being near (within toler¬ 
ance of) a graphical object. The values 
returned for this event are latitude and 
longitude. The other event is the “range¬ 
bearing” selection. The operator clicks on 
the screen and then drags the cursor before 
releasing the button to indicate the range 
and bearing desired. The bearing, in de¬ 
grees, and the range, in nautical miles and 
kilometers, are returned. 
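
As an illustration of the values a range-bearing event carries, the sketch below computes bearing and range between the press and release points. It assumes both points have already been converted to latitude and longitude and treats the Earth as a sphere; the article does not say how the prototype performs this calculation, so the haversine and initial-bearing formulas here are our choice.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)
KM_PER_NMI = 1.852           # international nautical mile, exact by definition


def range_bearing(lat1, lon1, lat2, lon2):
    """Return (bearing_deg, range_nmi, range_km) from point 1 to point 2.

    Inputs are in decimal degrees. Bearing is the initial great-circle
    bearing, measured clockwise from north in the range [0, 360).
    """
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)

    # Haversine great-circle distance.
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

    # Initial bearing.
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    bearing = math.degrees(math.atan2(y, x)) % 360.0

    return bearing, km / KM_PER_NMI, km
```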

When it receives a graphical event, the 
blackboard manager performs a call to the 
operating system to get the current system 
time. It then “posts” the event along with 
its arrival time and type on the blackboard. 
In response to requests by Gerbal, the 
blackboard manager returns the events in a 
list beginning with the most recent event 
and ending with the oldest. 
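
The posting and retrieval behavior just described can be sketched as follows. The class and field names are hypothetical, and the injectable clock is an addition of ours to make the example easy to exercise; the prototype's blackboard manager is a separate process, which this in-memory sketch does not attempt to model.

```python
import time
from dataclasses import dataclass


@dataclass
class GraphicalEvent:
    kind: str                # e.g. "object", "lat-long", "range-bearing"
    data: dict               # values returned for the event
    posted_at: float = 0.0   # arrival time, filled in by the blackboard


class Blackboard:
    def __init__(self, clock=time.time):
        self._clock = clock  # source of the current system time
        self._events = []    # stored in arrival order, oldest first

    def post(self, event):
        """Stamp the event with its arrival time and store it."""
        event.posted_at = self._clock()
        self._events.append(event)

    def events_since(self, t):
        """Return events posted after time t, most recent first."""
        return [e for e in reversed(self._events) if e.posted_at > t]
```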

Receiving the graphical event list, Ger¬ 


bal gets the current system time, then 
begins with the most recent graphical 
event. For each event, Gerbal compares the 
current time with the time that the event 
occurred. If the graphical event is recent 
(within a few seconds), it is stored in an 
internal data structure for processing. If it 
is old (over a minute), it is discarded. This 
window of time for graphical events is also 
undergoing adjustment as we accumulate 
more experience with the interface. The 
idea is to concentrate processing on recent 
events that are most likely associated with 
the speech input and to disregard older data 
that are not related. 
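
A minimal sketch of this filtering step follows. The article gives only rough bounds (a few seconds, over a minute) and notes that the window is still being tuned, so the single cutoff `max_age_s` and its default value are assumptions made for the example.

```python
def recent_events(events, now, max_age_s=5.0):
    """Keep events no older than max_age_s seconds.

    events -- list of (posted_at, payload) pairs, most recent first
    now    -- current system time, in the same units as posted_at
    """
    kept = []
    for posted_at, payload in events:
        if now - posted_at > max_age_s:
            # The list is newest-first, so every remaining event is older still.
            break
        kept.append((posted_at, payload))
    return kept
```

Receiving the list newest-first is what allows the loop to stop at the first event that falls outside the window.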

Gerbal begins processing graphical 
events by consulting a knowledge base to 
determine the user’s operational context. 
In the AWACS application the operator is 
viewing a display and responding to situ¬ 
ations as they arise. Gerbal predicts likely 
operator actions on the basis of what has 
happened recently. For example, if the 
operator is performing sensor suite control 
and management activities, Gerbal pro¬ 
cesses the events in a surveillance context. 
However, if the operator is performing 
activities related to the assignment of air¬ 
craft to targets, Gerbal processes events in 
a weapons control context. 

Gerbal then constructs a graphical con¬ 
text—all the possible voice-graphical ac¬ 
tions that can be identified from the graph¬ 
ical input data. Processing begins with the 
graphics side so that results from graphical 
input analysis can be used to disambiguate 
and complement the spoken input. (In the 
following section we describe how graph¬ 
ical input can be used to increase speech 
recognition accuracy.) For example, if two 
lat-long graphical events are found within 
a short time of each other, then a possible 
“create sector” action within the surveil¬ 
lance context is present. Note that pos¬ 
sible interpretations of the graphical 
events within other operational contexts 
are not considered. (A future extension 
will relax this restriction, so that after a 
failure to identify a voice-graphical action 
within the expected context, an inter¬ 
pretation within another context could 
be considered.) 
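
The construction of a graphical context can be pictured as matching the most recent events against patterns registered for the current operational context. The sketch below is an assumption-laden illustration: the rule table holds only two rules (the "create sector" rule above and a "commit" rule for two airplane selections under weapons control), and the rule format and the `max_gap_s` value are invented for the example. Gerbal itself is written in Prolog; Python is used here only for readability.

```python
# Hypothetical rule table: context -> list of (action, required event kinds).
RULES = {
    "surveillance": [("create sector", ["lat-long", "lat-long"])],
    "weapons control": [("commit", ["airplane", "airplane"])],
}


def graphical_context(context, events, max_gap_s=2.0):
    """Return the actions whose pattern matches the newest events.

    context -- current operational context; other contexts are not considered
    events  -- list of (timestamp, kind) pairs, most recent first
    """
    kinds = [kind for _, kind in events]
    times = [t for t, _ in events]
    actions = []
    for action, pattern in RULES.get(context, []):
        n = len(pattern)
        close_in_time = n < 2 or (len(times) >= n and times[0] - times[n - 1] <= max_gap_s)
        if kinds[:n] == pattern and close_in_time:
            actions.append(action)
    return actions
```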

Within the operational and graphical 
contexts, Gerbal parses the speech input 
and combines it with the graphical input. 
For example, an operator selects two coor¬ 
dinates on the screen while saying “Create 
sector, fighter threat” in the context of 
surveillance. The phrase is combined with 
information about the two coordinates 
(azimuth and sector number) to complete 
the semantics of the operator’s input: 































“Create sector four, fighter threat from 45 
to 180 degrees,” where azimuth has been 
calculated from the coordinates. In another 
example, from the weapons control area, 
the operator selects two airplane symbols 
on the screen while speaking the phrase 
“Commit.” The phrase is combined with 
information associated with the symbols in 
a weapons control context to complete the 
operator’s input: “Commit rifle flight on 
A001,” where “rifle flight” is the intercepter
and “A001” is the target.
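
To make the "create sector" example concrete, the following sketch derives the two azimuths from the selected points and fills in the spoken phrase. Several things here are assumptions: the article does not say what reference position the azimuth is measured from (we pass it in as `origin`), the coordinates are treated as flat east/north offsets, and the sector number is supplied by the caller rather than derived.

```python
import math


def azimuth_deg(origin, point):
    """Azimuth of point as seen from origin, in degrees clockwise from north.

    Both arguments are (east, north) pairs in the same flat coordinate system.
    """
    d_east = point[0] - origin[0]
    d_north = point[1] - origin[1]
    return math.degrees(math.atan2(d_east, d_north)) % 360.0


def complete_create_sector(origin, first, second, sector_number, threat):
    """Combine the spoken command with two selected coordinates."""
    start = round(azimuth_deg(origin, first))
    end = round(azimuth_deg(origin, second))
    return f"Create sector {sector_number}, {threat} from {start} to {end} degrees"
```

With an origin of (0, 0) and selections at (1, 1) and (0, -1), the two azimuths are 45 and 180 degrees, matching the completed command quoted above.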

If Gerbal fails to identify a voice-graph¬ 
ical action, then it first makes an effort to 
ensure that it has gathered all relevant 
graphical input. It makes another request 
of the blackboard manager to see if any 
graphical events that will complete a 
voice-graphical action have occurred since 
the last request. (This time period, roughly 
one-half second, is also being evaluated 
and is under revision.) If additional graph¬ 
ical input is found, Gerbal tries again to 
identify a voice-graphical action by adding 
to its graphical context and parsing the 
speech input in this new context (and in the 
original application context). 

If Gerbal still fails to identify a voice- 
graphical action, it makes an attempt to 
recover. If graphical input is missing or 
inappropriate for a recognized spoken 
command, Gerbal tries to get the required 
graphical input from the user. For example, 
if the operator has said “Create sector” 
without clicking the two coordinates, Ger¬ 
bal responds with the synthesized phrase 
“Select two coordinates to create sector.” 

Gerbal uses a similar recovery method if 
graphical input is present but a spoken 
phrase is missing or not recognized. Re¬ 
versing the previous example, if an opera¬ 
tor, in the context of surveillance, selects 
two coordinates on the screen without 
giving usable spoken input, then Gerbal 
tries to confirm that the operator wanted to 
say “Create sector” by eliciting a response 
from the operator. We are trying both 
graphical and verbal operator responses 
for this purpose. 
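
The two recovery paths can be summarized in a short sketch. The function name, the return format, and the wording of the confirmation question are hypothetical; only the prompt "Select two coordinates to create sector." comes from the prototype as described above.

```python
# Prompts for commands whose graphical input is missing.
PROMPTS = {"create sector": "Select two coordinates to create sector."}


def recovery_response(spoken_action, graphical_actions):
    """Choose a recovery step, or return None when no step applies.

    spoken_action     -- action named by the recognized phrase, or None
    graphical_actions -- actions suggested by the graphical input
    """
    if spoken_action and spoken_action not in graphical_actions:
        # Speech was understood but the graphical input is missing or unsuitable.
        prompt = PROMPTS.get(spoken_action, f"Graphical input needed for {spoken_action}.")
        return ("prompt", prompt)
    if not spoken_action and len(graphical_actions) == 1:
        # Graphics suggest exactly one action: ask, rather than assume.
        return ("confirm", f"Did you mean {graphical_actions[0]}?")
    return None
```

Note that the sketch always asks the operator and never acts on a guess.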

In many applications it might be reason¬ 
able for the system to make a guess about 
the user’s intentions when input is missing. 
However, our experience with the proto¬ 
type indicates that this tactic should be 
used sparingly in a fielded AWACS system,
in which operators are making assessments
and directing resources in potentially
life-threatening situations.

Gerbal is implemented in Quintus 
Prolog and, like the rest of the prototype 
software, runs on Sun workstations. Com¬ 
munications between I/O components and 


other processes are effected through the 
remote procedure call facility on the Sun 
Network File System. 

Development issues 

Preliminary testing of the system has 
been completed. A review panel of four in- 
house analysts, experts in the area of 
AWACS MMI, evaluated system performance.
The evaluation suite consisted of 36
individual commands. No recognition er¬ 
rors were observed, and the review panel 
recommended that development continue. 
Based on this recommendation, a formal 
test plan was drawn up, including additional
nonvoice AWACS enhancements.
The testing will consist of an experimental 
design that tests the voice and graphical 
components separately and in combina¬ 
tion. Formal testing is to be completed by 
the end of this year. Meanwhile, user ac¬ 
ceptance of the prototype seems high. 

Much of our near-term effort for the 
AWACS application will be aimed at
achieving voice I/O robustness in the op¬ 
erational environment. By robustness we 
mean few I/O errors (such as voice input 
not recognized, voice output not heard by 
the operator) in a setting where there is 
high background noise and the operator is 
under emotional stress. 

In general, speech recognition poses a
larger problem than speech generation in
achieving robustness in the AWACS environment.
There are two types of recognition
errors: interpreting background noise
as the closest legal vocabulary item, and
failing to match a legal item correctly.

To deal with the erroneous recognition 
of background noise, one can either elimi¬ 
nate the extraneous input or teach the re¬ 
cognizer how to handle it. Most speech 
recognizers require the user to hold a hand 
button down while speaking, thereby re¬ 
ducing background noise processing. Ini¬ 
tially, we felt this was unacceptable and 
investigated providing default sounds for 
the recognizer to match against back¬ 
ground noise. The ITT recognizer can be 
configured to not produce any output for 
such matches. Although we still think this 
is a valuable approach, we discovered that 
it isn’t necessary in our application. 
AWACS operators use a push-to-talk foot
switch in radio communication. Experi¬ 
enced operators have no difficulty adapt¬ 
ing to the use of this switch for talking with 
the computer as well as for talking on the 
radio. 

The second problem, mismatching legal 


vocabulary, is directly related to the size of 
the search space. In the discrete speech 
recognition process, a model of a spoken 
word or phrase is compared against a 
model of each of the possible words or 
phrases that can be recognized. Continu¬ 
ous recognition involves a much larger 
search space for identifying the spoken 
input because combinations of speech 
units (words, syllables, or phonemes) must 
be compared to the spoken input. High- 
level knowledge can be used to restrict 
processing to only the combinations most 
likely to have been spoken at the time. 
Since fewer possibilities are considered, 
the chances of making errors are reduced, 
consequently increasing recognition accu¬ 
racy. (For a description of the use of high- 
level knowledge to improve speech recog¬ 
nition, see the article by Young et al. 7 ) 

We are planning to exploit two kinds of 
high-level knowledge to improve recog¬ 
nizer performance. One kind is contextual 
knowledge of the application; the other is 
knowledge of the user’s graphical input. 
The prototype already uses both types of 
knowledge to augment spoken input. As an 
example of how we plan to use this knowl¬ 
edge, consider an operator making selec¬ 
tions on the screen while speaking a phrase. 
Gerbal will use contextual and graphical 
information to identify the phrases most 
likely to have been spoken by the operator 
within the operational and graphical con¬ 
texts. Gerbal will pass these possible 
phrases to the recognizer (along the dotted 
arrow in Figure 3), which will use them to 
restrict the search and correctly recognize 
the spoken phrase, before passing the tex¬ 
tual representation to Gerbal for further 
processing. 
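
The planned scheme restricts the recognizer's search itself. As a simpler stand-in that conveys the same idea, the sketch below filters a scored list of recognition hypotheses after the fact, keeping only phrases that the operational and graphical contexts allow. This post-filtering is a substitute technique, not the mechanism described above, and the hypothesis format is an assumption; we do not know whether the recognizer can return such a list.

```python
def restrict_hypotheses(hypotheses, allowed_phrases):
    """Return the best-scoring hypothesis that appears in allowed_phrases.

    hypotheses      -- list of (phrase, score) pairs; higher score is better
    allowed_phrases -- phrases considered likely in the current contexts
    Returns None when no hypothesis is allowed.
    """
    allowed = {phrase.lower() for phrase in allowed_phrases}
    candidates = [(score, phrase) for phrase, score in hypotheses if phrase.lower() in allowed]
    if not candidates:
        return None
    return max(candidates)[1]
```

For instance, if the operator has just selected two coordinates in a surveillance context, a lower-scoring "Create sector" hypothesis would be preferred over a higher-scoring "Commit" that the context rules out.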

Future directions 

Because its graphics input is limited to 
icon selection and bearing and range indi¬ 
cation, the AWACS application is, as we 
said earlier, more a talk-and-point system 
than a talk-and-draw system. Our current 
research focuses on applying speech and 
graphics bundling to new areas to promote 
the evolution of this technology to a closer 
approximation of the talk-and-draw 
model. 

An example is our continuing work with 
the On-Line Planning (OLP) project. OLP 
is an on-line database of plans for Boeing’s 
parts-manufacturing processes. Our aim in 
the project is to apply bundling to create an 
environment to facilitate the planning 
process. 






One of the features of bundling as ap¬ 
plied to the OLP project is the dichotomy 
between the production phase and the per¬ 
formance phase. In the production phase 
the planner, in determining the manufac¬ 
turing processes for a part, creates the 
speech and graphics bundle. A problem is 
inherent in creating the bundle: The plan¬ 
ner must be able to create bundles quickly 
and efficiently to meet schedules. Devel¬ 
oping and assembling material are often 
difficult and time-consuming tasks. To 
simplify them, a production environment 
with a spoken language interface and a tool 
set should be available to the planner. 

Among the tools under consideration at 
this time is a concept classifier, based on 
the work of Rohr 2 and Jackendoff, 8 to identify
concepts best expressed graphically
and those best expressed verbally. Also 
under consideration are feature recognizers,
discussed by Requicha and Vandenbrande. 9
In this case the problem is not to
develop a recognizer but to develop effi¬ 
cient and intuitive means of linking the 
bundle to the representation created by the 
recognizer. 

In the performance phase the OLP user 
plays back the speech and graphics bundle 
and interrogates it, stopping the playback 
and posing a question such as “How does 
this bolt insert into the assembly?” The 
speech recognition and natural language 
components understand the question, but 
answering it depends to a large extent on 
translating between graphical objects and 
language. The bundling system parses the 
question and understands it with a lexicon; 
the problem is to translate from a verbal to 
a graphical representation. Thus the per¬ 
formance phase is more difficult than the 
production phase; the former deals not only 
with language understanding but also with 
graphics understanding and with the trans¬ 
lation between language and graphical 
representations. Our research is focusing 
on scene descriptions, based on work by 
Novak, 10 and on deep case representations, 
discussed by Fillmore, 11 as methods to 
perform the translation. 


We are encouraged by our
AWACS interface develop¬ 
ment effort and our prelimi¬ 
nary work on the OLP application. This ex¬ 
perience convinces us that bundling speech 
and graphics can improve communications 
between humans and computers. Future 
improvements, however, depend on devel¬ 
oping methods to transform bundled 
speech and graphics input into an internal 


representation. Additionally, generators 
are needed that will use this representation 
to output an appropriate composite of 
speech and graphics. Success with these 
representations and transformational is¬ 
sues will lead to computer systems that 
take full advantage of the talk-and-draw 
style of communications. ■ 


Acknowledgment 

Many of the MMI components mentioned in 
this article, including the graphics handler and 
blackboard manager, were the result of work 
done by the Rapid (rapid prototype interface 
design) prototyping group of the Man/Machine 
Systems Technology Organization of Boeing’s 
Aerospace and Electronics Division. (For an 
overview of the Rapid workstation, see Moody, 
Hudson, and Salisbury. 12 )


References 


1. O. Sacks, “The President’s Speech,” in The 
Man Who Mistook His Wife for a Hat, 
Summit Books, New York, 1985, pp. 76-80. 

2. G. Rohr, “Using Visual Concepts,” in Vis¬ 
ual Languages, S. Chang, T. Ichikawa, and 
P. Ligomenides, eds., Plenum Press, New 
York, 1986, pp. 325-348. 

3. S. Guastello, M. Traut, and G. Korienek, 
“Verbal Versus Pictorial Representations of 
Objects in a Human-Computer Interface,”
Int’l J. Man-Machine Studies, July 1989,
Vol. 31, No. 1, pp. 99-120.

4. C. Fu, “An Independent Workstation for a 
Quadriplegic,” in International Exchange 
of Experts and Information in Rehabilita¬ 
tion, Monograph #37, Richard Foulds, ed., 
World Rehabilitation Fund, New York, 
1986, pp. 42-44. 

5. R. Bolt, “‘Put That There’: Voice and Gesture
at the Graphics Interface,” tech. report,
MIT, Cambridge, Mass., 1980. 

6. C.D. Anderson, “Application of Speech 
Recognition and Touch-Screen Input Systems
to Airborne C³ Operations—Results of
Mission Simulator Evaluation,” Document 
No. 10180-28809-1, The Boeing Co., Se¬ 
attle, Wash., 1985. 

7. S.R. Young et al., “High-Level Knowledge 
Sources in Usable Speech Recognition 
Systems,” Comm. ACM, Vol. 32, Feb. 1989, 
pp. 183-194. 

8. R. Jackendoff, Semantics and Cognition, 
MIT Press, Cambridge, Mass., 1983. 

9. A. Requicha and J. Vandenbrande, “Auto¬ 
mated Systems for Process Planning and 
Part Programming,” in Artificial Intelli¬ 
gence: Implications for CIM, A. Kusiak, 
ed., Springer-Verlag, Berlin, 1988, pp. 301- 


10. H.-J. Novak, “Strategies for Generating 
Coherent Descriptions of Object Movements 
in Street Scenes,” in Natural Language Gen¬ 
eration, G. Kempen, ed., Martinus Nijhoff, 
Dordrecht, Netherlands, 1987, pp. 117-132. 

11. C. Fillmore, “The Case for Case,” in Universals
in Linguistic Theory, E. Bach and R.
Harms, eds., Holt, Rinehart & Winston, New 
York, 1968, pp. 1-90. 

12. S.A. Moody, T.H. Hudson, and M.W. Salis¬ 
bury, “Rapid: A Prototyping Environment 
for Battle Management Information 
Systems,” Proc. Third Annual User-System 
Interface Conf., Austin, Tex., 1988, pp.
109-116. 



Mark W. Salisbury is a senior systems analyst 
in the Speech Perception and Language Com¬ 
prehension Group in the Aerospace and Elec¬ 
tronics Division of the Boeing Company. 
Before joining this group, he worked for the 
division’s Man/Machine Systems Technology 
Organization. During the last five years, he has 
worked on a number of research and develop¬ 
ment efforts involving the design of human- 
computer interfaces for military and space 
applications. His current research interests in¬ 
clude human-computer interface design, voice 
I/O technology, and software engineering. 

Salisbury has a PhD in computer science 
education and an MS in computer and informa¬ 
tion science from the University of Oregon. He 
also holds an MAT in economics and a BS in 
secondary education from Western Oregon 
State College. 



Joseph H. Hendrickson is a senior software 
engineer in Boeing’s Aerospace and Electronics 
Division. He spent 10 years at the company 
designing and testing real-time, embedded soft¬ 
ware systems before joining the Speech Percep¬ 
tion and Natural Language Processing Group. 
His current research interests include software 
engineering, voice recognition, and natural lan¬ 
guage processing. 

Before joining Boeing in 1978, Hendrickson 
was a faculty member at the University of Puget 
Sound and the State University of New York, 
Binghamton. He received his PhD and MS in 
mathematics from Tulane University and his 
BA in mathematics from the University of Cali¬ 
fornia, Riverside. 











Terence L. Lammers is a senior engineer in the 
Software Engineering Organization of Boeing’s
Aerospace and Electronics Division. He is a 
principal investigator on a software engineering 
project to enhance software development meth¬ 
ods. The research reported in this article was 
done while the author was associated with the 
Speech Perception and Language Comprehen¬ 
sion Group. In the course of nearly five years 
with Boeing he has worked on projects includ¬ 
ing the development of database interfaces, the 
development of an intelligent diagnostic sys¬ 
tem, and the design of an environment for manu¬ 
facturing planning. His current research inter¬ 
ests are machine translation, speech recognition 
and natural language processing, the cognitive 
aspects of graphical and verbal reasoning, and 
phonetics and phonology. 

Lammers has a PhD in Slavic linguistics 
from Indiana University, a BS in computer 
science from the University of Montana, and 
an MS in computer science from Montana 
State University. 



Caroline Fu is currently the manager of the 
Speech Perception and Language Comprehen¬ 
sion Group at Boeing. She directs a speech 
technology and natural language research pro¬ 
gram, which integrates speech recognition and 
response with natural language processing for 
use in defense and industry. She conducted a 
Boeing-sponsored, three-year Voice Aids for 
the Handicapped program, which implemented 
a voice-controlled robotic workstation for the 
physically disabled professional. Her 18 years 
of applying computer technology to scientific 
and engineering problems include the develop¬ 
ment of voice recognition, distributed-process¬ 
ing, and man-machine interface systems for 
drug safety research, and R&D software for 
automobile safety and laser fusion. 

Fu received her MS degree in computer sci¬ 
ence and her triple-major BS degree in applied 
mathematics, electrical engineering, and phys¬ 
ics, both from the University of Wisconsin- 
Madison. 



Scott A. Moody is the lead software engineer 
directing research and development in the Boeing
Rapid project, a collection of rapid-prototyping
tools for generic C³I graphics workstations.
His research interests include compiler
design, distributed graphical prototypes, and 
design of application generators utilizing reus¬ 
able software technology. 

Moody received an MS and a BS in computer 
science from the University of Washington at 
Seattle. He is a member of the ACM and the 
IEEE Computer Society. 


Readers may write the authors at Boeing 
Aerospace and Electronics Division, PO Box 
3999, MS 82-58, Seattle, WA 98124-2499. 


TurboCASE™: the affordable Macintosh 
CASE tool with power to spare. 


New Version 2.0 

• Structured Analysis (with 
Real-Time extensions). 

• Data Modeling. 

• Object Oriented Analysis. 

"TurboCASE is the most usable 
CASE tool on the market. You can 
use it right away and be productive 
right away." says a TurboCASE user 
in the Feb. 20 issue of Mac WEEK. 

Price: $995.00 Demo: $15.00 



StructSoft, Inc. 5416 156th Ave. SE, Bellevue, WA 98006. Tel: 206-644-9834 Fax: 206-644-7714


Reader Service Number 6 


Extending the Notion 
of a Window System 
to Audio 


Lester F. Ludwig, Bell Communications Research* 

Natalio Pincever, Massachusetts Institute of Technology Media Lab 
Michael Cohen, Northwestern University 


Visual window systems have become successful elements of
human-machine interfaces, allowing multiple applications to
simultaneously use a common visual display resource. This capability
is really the combination of two abilities:

(1) The user can control the spatial 
organization of the multiple visual 
objects (windows) via a window 
manager utility. 

(2) The user can shift visual attention as 
needed among the various displayed 
objects. 

With audio’s increasing importance in computer applications
(multimedia or otherwise) [1,2], users will soon need similar
presentation, management, and organizational capabilities to avoid a
confusing cacophony of multiple audio sources sounding at once.

In our everyday lives, avoiding such confusion presents no real
problem as our hearing system sorts out sound sources of differing
timbre [3] and differing locations in space [4,5]. In office, home,
and social settings we can shift attention among various audio
sources. We perceive each source as coming from a specific location
or direction and as exhibiting other qualities, such as being muffled
or echoed.

In today’s PCs and workstations, sound sources are either switched
on-off or mixed with simple audio mixers. These techniques, however,
do little to merge sound sources in the way our ears expect, that is,
with differing binaural delay and frequency-response signatures [4,5].
Can we extend PC and workstation audio to use human ability to deal
with multiple sound sources?

Just as visual window systems let multiple applications share display
resources, an audio window system could bring order to the cacophony
of multiple simultaneous audio sources.

The work summarized in this article examines how to take advantage of
the ear’s valuable sorting and labeling potential. By using audio
signal processing to exploit the psychoacoustic properties of hearing,
we can simultaneously present multiple audio sources in a way that
lets a user shift attention among them. We can organize audio sources
spatially in one, two, or three dimensions (that is, the sources are
perceived to be at different locations in a line, plane, or three
space), and we can introduce hierarchical levels of emphasis. By
dynamically reconfiguring the signal processing pipelines via simple
human interfaces, we can interactively alter the organization of
multiple audio sources. From the user’s view, the resulting
functionality has a strong similarity to visual window systems; hence,
we refer to it as an audio window system. The similarity to visual
systems lets audio window systems complement or work in synergy with
visual window systems.

* Lester Ludwig has since become principal scientist at Vicor, Inc.

0018-9162/90/0800-0066$01.00 © 1990 IEEE

Envisioned use. A user constructing a multimedia document or
presentation might need to manage several audio sources. Audio from
various multimedia databases, editors, and message systems would have
to be presented in an environment that includes other application
output and prompts, telephone calls, voice mail, and collaborative
teleconferences that might need to access the editors and databases.
A graphical user interface would feature a map of the sound sources’
virtual locations. The user could reposition a source by dragging its
visual icon with a mouse, and the perceived source location would
dynamically track the icon position. For sound-annotated documents,
audio would seem to come from the same position as the corresponding
visual window.

Spatial, hierarchical, and exclusion 
management capabilities could also be 
used within an individual application just 
as child windows can be used within a 
parent window in visual window systems. 
Therefore, audio windowing can be used 
both within a single application and as part 
of the audio management of all applica¬ 
tions active at the user terminal. 

For example, audio windowing can be used to implement spatial data
management systems [6] in which users browse through a data “world”
of relocatable objects. Simulations and scientific visualizations
could use audio windows to hierarchically and spatially organize sonic
cues corresponding to activity on a factory floor or propagation of a
crack through a solid. Messaging applications could use audio
windowing to manage voice annotations to a multimedia message (or
even annotations to forwarded voice mail) so that the annotations are
spatially separated from the rest of the message’s audio. In
entertainment applications, audio windowing can be used for special
effects or to support theatrical sound systems such as Surround Sound
360 or THX.

Although much technology has been applied to teleconferencing and
computer-supported cooperative work [7], little has been done to
improve how audio is handled and presented in a teleconference. Audio
windowing can help introduce new, natural metaphors into electronic
meetings, and hierarchies can help manage the audio for the main
conference and side discussions. Spatial metaphors improve speaker
recognition and reduce cacophonous effects that occur when multiple
speakers talk simultaneously. Teleconferences might also be presented
as informal interaction systems. As an extension of increasingly
popular “chat lines,” we can imagine virtual gatherings of
geographically separated users who “circulate” around an electronic
“room,” listening in or temporarily joining conversations and moving
on as would conventioners or minglers at a cocktail party. Caucuses
wishing to meet privately could electronically adjourn to an adjacent
private audio chamber.


Relationship with graphics window servers and the Vox audio server.
Our notion of audio windowing focuses on the user’s audio interface
and is therefore analogous to the interface of a visual window system.
A visual window system interface uses a graphics window server and a
window manager. Similarly, an audio window system must include audio
processing for implementing the audio presentation and a means of
controlling this processing. Graphical window servers also give
application programmers an interface to such functions as resource
management, event notification, and other operations not directly
involved in display. The Vox Audio Server [2] focuses on resource
management of audio peripherals such as mixers, tape recorders, and
speech interfaces. However, it does not directly give the user an
audio presentation interface. Vox could be used to manage and control
audio processing allocations and configurations for an audio window
system. The scope of an audio window system, therefore, partially
overlaps the scope of Vox (see Figure 1).

Our goals. Our work at Bellcore’s Integrated Media Architecture
Laboratory [8-10] project attempted to find solutions to the audio
management problem that arises when several communications services
use audio simultaneously. We sought to
• investigate methods, costs, and effec¬ 
tiveness of managing several simulta¬ 
neous audio sources; 

• examine how to distribute audio win¬ 
dowing functions among user termi¬ 
nals, local networks, and public net¬ 
works; 

• consider terminal and network imple¬ 
mentations of audio window systems 
that can work together; and 

• explore how audio windowing can be used in stand-alone services
(such as teleconferencing, spatial data management [6], training,
CAD, entertainment), some of which the network may provide directly
and others the network must indirectly support.

We have not fully met any of these 
goals, but we can identify some promising 
techniques, observations, and findings 
from our work. 

Techniques 

We focus here on the signal processing methods used to create
hierarchical and spatial distribution among “nearly arbitrary” (not
pure sine wave*) audio sources. For more details, see Ludwig and
Pincever [10].

Figure 1. The relationship of audio windowing and the Vox Audio Server.

Hierarchies. The music, radio, and film 
industries have used special effects for 
years to make certain sound sources “stand 
out” or “stand back” from others. In par¬ 
ticular, the music industry commonly uses 
electronic processing to emphasize solo 
instruments or vocalists. We believed we 
could use such processing to give different 
degrees of emphasis to individual audio 
signals within a collection. These differen¬ 
tiated signals would be naturally perceived 
as being in a hierarchical relationship. 
There are several techniques and commer¬ 
cial products that accomplish this by incor¬ 
porating one or more of the following 
functions: 

• Self-Animation (massive frequency- 
dependent phase distortion) 

• Distortion (nonlinear wave-shaping, 
that is, amplitude distortion) 

• Thickening (a chorused “doubling” 
effect via pitch-shifted signals) 

• Peaking (linear band-emphasis filter¬ 
ing) 

• Distancing (reverberation and echo) 

• Muffling (linear low-pass filtering) 

• Thinning (linear high-pass filtering) 

* In most of the techniques we have studied, pure sine waves act as
singularities (that is, the effects break down) because artifacts in
complex frequency spectra are key to producing effective cues for
hearing. Most natural and machine-generated sounds are not pure sine
waves, so we do not consider this a concern.

A number of commercial products called exciters or imagers perform
self-animation, both with and without distortion. Thickening (also
called doubling) is available in devices called pitch shifters or
harmonizers. Peaking can be realized by graphic, parametric, or
shelving equalizers. Distancing is performed by echo, digital delay,
and reverb devices. Muffling and thinning can be realized by dedicated
low-pass and high-pass filters but are also available on graphic,
parametric, and shelving equalizers. Distortion-only devices (such as
guitar “fuzz boxes”) are also available. We restricted our signal
processing to exciters, pitch shifters, and equalizers based on our
experience with the various options and the effects we desired.
However, we provide some background on each of the functions below.

Figure 2. A spatial sound system based on the Crystal River Convolvatron.

Self-animation makes a source sound more lively by accentuating
frequency variations in the source signal, much as stones in a shallow
creek accentuate water turbulence to produce more eye-catching
patterns. Distortion produces a strained, perhaps “excited,” sound,
but it reduces intelligibility and increases listener fatigue.
Thickening produces a “thicker” sound while slightly decreasing
intelligibility, although many low-cost pitch shifters also introduce
additional undesirable artifacts.* Peaking works well with speech when
used to boost amplitude in the 1-kilohertz range (where a good deal of
speech phoneme information is carried), but is otherwise limited.
Distancing gives a fuller, mystical sound but reduces intelligibility.
Muffling creates impressions of confinement or distance but with
greatly reduced intelligibility.

* An arbitrary audio signal segment can be pitch shifted via Doppler
effects created when a signal buffer is read out at a different rate
than it was loaded. Underflow and overflows naturally result, so pitch
shifters employ time-staggered multiple buffers whose outputs are
sequentially directed to the output port in some form of
fixed-duration cycle. Early low-cost pitch shifters periodically
switched between the buffers, resulting in glitches. Modern low-cost
pitch shifters periodically pan between the buffers, resulting in
flanging effects because of the time staggering. The best approach
would intelligently splice waveforms from a finishing buffer to those
of a starting buffer. In any case, the pitch-shifted signal can then
be mixed with the original signal. If the pitch shift is small, say 30
cents, the result is a thick chorused sound. Intelligibility can be
improved by delaying the original signal by an amount roughly equal to
the mean buffer delay incurred by the pitch-shifted signal.
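
The thickening effect described above (a slightly pitch-shifted copy
mixed with the original) can be illustrated with a minimal Python
sketch. It is not the processing used in the commercial pitch shifters
discussed here: it uses a single buffer read out at a different rate
with linear interpolation, so the shifted copy simply runs short (the
underflow that real devices hide with multiple staggered buffers). The
function names and the `wet` mix parameter are assumptions made for
illustration.

```python
def cents_to_ratio(cents):
    """Convert a pitch shift in cents to a read-out rate ratio (1200 cents = one octave)."""
    return 2.0 ** (cents / 1200.0)


def resample(buffer, ratio):
    """Read `buffer` out at `ratio` times the rate it was loaded (linear interpolation)."""
    out = []
    pos = 0.0
    while pos < len(buffer) - 1:
        i = int(pos)
        frac = pos - i
        out.append((1.0 - frac) * buffer[i] + frac * buffer[i + 1])
        pos += ratio
    return out


def thicken(buffer, cents=30.0, wet=0.5):
    """Mix the original with a slightly pitch-shifted copy (assumed 50/50 mix by default)."""
    shifted = resample(buffer, cents_to_ratio(cents))
    n = min(len(buffer), len(shifted))  # the shifted copy underflows for upward shifts
    return [(1.0 - wet) * buffer[k] + wet * shifted[k] for k in range(n)]
```

A 30-cent shift corresponds to a read-out ratio of about 1.0175, which
is why the result is heard as a chorused doubling rather than as a
separate pitch.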

Informal experiments and our previous 
experience with these techniques suggest 
we can use self-animation, thickening, and 
peaking to highlight or emphasize an audio 
source, making it more prominent than an 
unprocessed source. Similarly, we believe 
we can use muffling and distancing to 
deemphasize or “background” a source, 
making it less prominent than an unpro¬ 
cessed source. We believe muffling can 
help to acoustically denote a metaphorical 
“grabbing” of an audio source (similar to 
“grabbing” a visual icon with a mouse in a 
visual system). 

We chose the following steps for creating hierarchies. The highest
level of enhancement uses a thickening operation piped into a
self-animation operation. The next level uses only self-animation. The
lowest level presents the native, unprocessed signal. These choices
work fairly well given the quality of the audio products we used,
although the difference in emphasis between the highest two levels was
far less than between the lowest two. Thickening alone compared with
self-animation alone does not create a well-defined hierarchical
relation. However, the fact that the slightly trained ear can
recognize thickening and self-animation as separate operations
suggests possibilities for orderings with two or more indices. (Note
that a hierarchy has only one index, which is the level occupied
within the hierarchy.) We did not explore this, however, with any
serious effort. We reserved muffling and thinning for use with the
window manager to provide in-band audio feedback when an input device
(such as a mouse) singles out or grabs a source.

One-, two-, and three-dimensional source distributions. Human hearing
has remarkable capabilities for extracting subtle information from
audio sources at the same time it extracts directional information
[3-5]. For example, the same amplitude and pitch pattern can be
filtered by a speaker’s mouth and throat to say either “You went
home?” or “He flew where?” A listener not only can recognize these
questions effortlessly but also can tell roughly where they are coming
from. The directional perception of human hearing appears to have much
to do with binaural differences in delay and frequency filtering as
produced by environmental and outer-ear acoustics.** Further, human
hearing can use its ability to discriminate source direction to sort
simultaneous audio sources and focus on a specific source, as at a
cocktail party.

** Binaural differences are those between what the left and right
eardrums are presented from the end of the ear canal. Recognition of
different filtering patterns created by outer-ear acoustics implies
that the hearing system works from a set of sound templates from which

Figure 3. A prototype audio window system.

In our early work [10] we used audio mixers with stereo outputs in an
attempt to create a one-dimensional spatial distribution for sounds
presented via stereo speakers or headphones. Stereo-output mixers use
differences in amplitude only and ignore the problem of correctly
synthesizing natural delay and filtering attributes. Amplitude
differences work well for correlated multiple sources (such as music)
transmitted through speakers or headphones and for a small number of
uncorrelated sources when used with spatially separated speakers.
However, the amplitude difference effect creates cacophony under more
general conditions. We experimented with recording studio techniques
to make simultaneous sources sound more separated by changing
equalization to impose a distinct spectral filtering signature on each
source. Our experiments suggested that what was actually being
exploited here was the hearing system’s ability to sort and focus
based on its ability to distinguish source location. Although
unorganized filtering and left-right amplitude differences can be
combined to create crude but usable artificial separations [10], we
decided to explore properly synthesized spatial sound. One reason for
this decision was the strong metaphors made possible by synthesizing
the effect of sound sources distributed in three space.
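
For comparison with properly synthesized spatial sound, the
amplitude-only placement performed by a stereo-output mixer can be
sketched as follows. This is a minimal illustration, not a model of
any particular mixer: the constant-power (sine/cosine) pan law and the
names `pan_gains` and `stereo_mix` are assumptions. Note that each
source reaches both ears with no delay or filtering difference, only a
level difference.

```python
import math


def pan_gains(position):
    """Left/right gains for a pan position in [0, 1] (0 = hard left, 1 = hard right).

    Uses a constant-power law, one common choice (an assumption here).
    """
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)


def stereo_mix(sources):
    """Mix (signal, position) pairs into left and right sample lists."""
    n = max(len(signal) for signal, _ in sources)
    left = [0.0] * n
    right = [0.0] * n
    for signal, position in sources:
        gain_left, gain_right = pan_gains(position)
        for i, sample in enumerate(signal):
            left[i] += gain_left * sample
            right[i] += gain_right * sample
    return left, right
```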

Three-dimensional spatial sound synthesis has been explored for many
years, and a variety of techniques and systems are available. Spatial
auditory cues are produced in nature by reflections and absorptions
about the head, torso, and outer ear, and they have been extensively
catalogued as a function of three-space location with respect to the
head. Spatial sound synthesis uses linear filtering techniques to
impose (stereo) pairs of transfer functions (one for the headphone’s
left speaker, one for the right) whose amplitude and delay response
reproduce those of known spatial cues. From a signal processing
standpoint, each transfer function can be imposed by continuously
“shaping” the source audio signal with the impulse response
corresponding to the desired transfer function. Algorithmically, this
shaping is done via an operation called convolution. For a discrete
time signal (such as sampled digital audio), a convolution first
computes at each instant scaled versions of current and previous
samples of the audio signal. Each sample is scaled by multiplying it
by a specific coefficient from the collection of coefficients that
represent the discrete time impulse response. The scaled samples are
then added to produce the output sample for that particular instant.
In general, impulse responses can be infinite (involving an infinite
number of multiply and add functions realized by recursion) or finite
(a finite collection of multiply and add functions that can be
pipelined in parallel).
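
The finite-impulse-response case described above can be written
directly as nested multiply-and-add loops. The following Python sketch
is illustrative only: the function names are assumptions, and real
spatial cues would use measured left-ear and right-ear impulse
responses rather than the toy coefficients suggested in the usage
note.

```python
def fir_filter(signal, impulse_response):
    """Convolve `signal` with a finite impulse response.

    Each output sample is the sum of the current and previous input
    samples, each scaled by the matching impulse-response coefficient.
    """
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, coefficient in enumerate(impulse_response):
            if n - k >= 0:
                acc += coefficient * signal[n - k]
        out.append(acc)
    return out


def spatialize(mono, h_left, h_right):
    """Impose a stereo pair of transfer functions on one monaural source."""
    return fir_filter(mono, h_left), fir_filter(mono, h_right)
```

For instance, `spatialize(x, [1.0], [0.0, 0.0, 1.0])` leaves the left
channel unchanged and delays the right channel by two samples, a crude
stand-in for a binaural delay difference.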

The spatial sound system we are using is based on the Crystal River
Convolvatron, a high-throughput (300 million instructions per second)
audio digital-signal-processor convolution engine fitted with four
analog audio input channels and a stereo audio output (see Figure 2).
This system time-interleaves eight separate 128-stage finite impulse
response filters in 24-bit arithmetic on a single parallel pipeline
created around a 128-stage delay-line/multiplier engine. An additional
processor selects, retrieves, and interpolates coefficient arrays used
by the pipeline to create the desired transfer functions. The outputs
of all left- and right-ear filters are presented through stereo
headphones (which this approach requires). The system can also perform
real-time coordinate transformations, allowing not only the source
locations but also the observer location to be adjusted in real time.

A McDonnell Douglas Polhemus Isotrak gives the headphones’ three-space
coordinates (x, y, z; roll; pitch; and yaw) to the Convolvatron’s
coordinate-transformation calculation processor. The resulting
dynamically selected filters use stereo spatial cues to create the
psychoacoustic imagery of four monaural sound sources distributed in
three dimensions. The sources’ perceived position with respect to the
user incorporates the user’s head position and orientation at each
moment (the position can be recalculated in 33 milliseconds). Further,
the perceived position of each sound source can be stationary or
moving (using periodic coordinate updates). This system is the work of
Foster, with assistance from Wenzel and Wightman [11]. Although the
head-tracking corrections might seem frivolous at first, they greatly
help system effectiveness.
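
The effect of head tracking can be illustrated with a simplified
coordinate transformation. The sketch below is not the Convolvatron’s
algorithm; it handles only the horizontal plane and yaw, and the axis
conventions (x to the listener’s right, y straight ahead, positive yaw
and azimuth clockwise) and the function name are assumptions. It shows
why a fixed source must appear to move when the head turns.

```python
import math


def relative_azimuth(source_xy, head_xy, head_yaw_deg):
    """Return (azimuth in degrees, distance) of a source as heard from the head.

    Azimuth 0 is straight ahead; positive values are to the listener's
    right. The result is wrapped into the range [-180, 180).
    """
    dx = source_xy[0] - head_xy[0]
    dy = source_xy[1] - head_xy[1]
    bearing = math.degrees(math.atan2(dx, dy))  # clockwise from the +y axis
    azimuth = (bearing - head_yaw_deg + 180.0) % 360.0 - 180.0
    return azimuth, math.hypot(dx, dy)
```

A source straight ahead of a listener who then turns 90 degrees to the
right ends up at an azimuth of -90 degrees, that is, at the listener’s
left ear, so the filter pair would have to be reselected on each
tracker update.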

The Convolvatron can distribute sound sources in one and two
dimensions as special cases. Even in the 1D case, the Convolvatron
lets the user more clearly distinguish and selectively focus on two or
more uncorrelated sources. Further, the 1D array of sources can appear
in front of the user, to the side, etc., as opposed to the “inside the
head” effect created by traditional stereo amplitude pans. The 2D case
has obvious utility for PCs and workstations, as sounds can be
spatially associated with specific application windows on the screen,
while 3D arrays can extend the workspace to include regions beyond the
screen area.

When headphones can be used, the use of 3D audio imaging to create
realistic 2D and 3D arrangements of sound sources is far superior to
conventional stereo output mixing. When stereo output mixing is used,
intuitive graphical interfaces can control the organization of the
sound sources. If movable audio sources are physically associated with
the screen locations of visual windows, audio window management can
track the effects induced by a visual window manager (we have not
tried to implement this).

Prototype audio 
window system 

Figure 3 shows our prototype audio window system combining
hierarchical and spatial processing functions with a
computer-controlled switch, software, and human input devices. This
system uses the Convolvatron as a multi-input, stereo-output audio
“mixer.” We can combine two unmodified Convolvatrons to channel eight
inputs into a single stereo output. Source position and operator
pipeline control are under computer control in the form of the
Convolvatron and a crossbar switch, respectively.

Incoming audio signals are presented to the switch, where they are
either ignored, routed without processing to the spatial mixing stage
(the Convolvatron), or routed to hierarchical audio processing
modules. A first crossbar-switch stage performs concentration and
permutation functions for the hierarchical modules. The resulting
outputs are then presented to a second crossbar-switch stage, where
the outputs and unprocessed signals are either ignored or routed to
specific inputs of the spatial mixer. The second stage performs
concentration and permutation functions for the spatial mixer. The
system also includes a computer-controlled stereo mixer to provide
parallel output for comparisons.
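
The two-stage routing just described can be pictured as two lookup
tables applied in sequence. The following Python sketch is a
simplified software model, not the prototype’s switch control code:
the table layout, the `"direct"` keyword, the source names, and the
function name `route` are all assumptions made for illustration.

```python
def route(inputs, stage1, modules, stage2, n_mixer_inputs=4):
    """Model two crossbar stages feeding a spatial mixer.

    inputs  -- dict: source name -> list of samples
    stage1  -- dict: source name -> module name, or "direct" for no
               processing; sources missing from the table are ignored
    modules -- dict: module name -> function applied to a sample list
    stage2  -- dict: source name -> spatial mixer input index; sources
               missing from the table are ignored
    """
    mixer_inputs = [None] * n_mixer_inputs
    for name, signal in inputs.items():
        choice = stage1.get(name)
        if choice is None:
            continue  # ignored at the first stage
        processed = signal if choice == "direct" else modules[choice](signal)
        slot = stage2.get(name)
        if slot is not None:
            mixer_inputs[slot] = processed
    return mixer_inputs
```

Reconfiguring the presentation then amounts to rewriting the two
tables, which is the kind of change a window manager could make in
response to user input.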

We are studying two ways to control this hardware arrangement using
metaphor-based “window managers.” The first approach uses a graphical
user interface (X Window System) controlled by a mouse (see Figure 4).
This interface currently focuses on 1D spatial control and emphasizes
the control of hierarchy. The four stacked boxes contain icons
representing audio sources, which the user manipulates with the mouse.
The bottom box always shows icons for all the audio channels, which
the user can turn on or off with a simple mouse click. When a channel
is on, another icon for it is displayed in one of the three upper
boxes. When a channel is turned on after having been off, its icon is
displayed in the same box where it was last displayed. The three upper
boxes each denote a distinct type of hierarchical processing effect.
When an icon is in one of the upper boxes, the corresponding channel
is processed by the hierarchical effect corresponding to that box. The
left-to-right position of the icon corresponds to the audio channel’s
1D spatial position through the headphones. The mouse adjusts the
icon’s location, and hence its audio position and level of
hierarchical effect, as well as the volume.

Our second approach focuses on 2D and 3D audio windowing using a VPL
Research Dataglove instead of a mouse and employing audio cues for
user feedback. The Dataglove is a flexible fabric glove fitted with
optical strain gauges for measuring hand postures and an Isotrak for
measuring hand position in three space. Our Dataglove approach uses
both pointing and hand-gesture recognition [12]. Pointing in the
perceived direction of an audio source singles it out to be turned on
or off, moved, or changed in the hierarchy. Grasping and releasing
motions move an audio source from one perceived location to another.
As feedback, audio signal processing “thins” a source when the user
points at it, and muffles a source when the user “grabs” it. Holding
out one, two, or three fingers changes the source’s hierarchy level.
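
The in-band feedback described above pairs each gesture with a linear
filter: thinning is high-pass filtering and muffling is low-pass
filtering. A minimal Python sketch follows. The one-pole filter, the
`alpha` smoothing constant, and the gesture labels are assumptions
chosen for brevity; the prototype used equalizer hardware rather than
code like this.

```python
def muffle(signal, alpha=0.2):
    """One-pole low-pass filter: each output moves a fraction alpha toward the input."""
    out = []
    y = 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out


def thin(signal, alpha=0.2):
    """Complementary high-pass filter: the input minus its low-passed version."""
    return [x - low for x, low in zip(signal, muffle(signal, alpha))]


def feedback(signal, gesture):
    """Apply the audio cue for a gesture: 'point' thins, 'grab' muffles."""
    if gesture == "point":
        return thin(signal)
    if gesture == "grab":
        return muffle(signal)
    return list(signal)
```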


Envisioned
implementations

Terminal-based audio window systems 
let the user manage audio sources locally. 
This ability is particularly appropriate if 
many of the applications are run entirely 
on the terminal. Publicly shared network- 
based audio window servers, on the other 
hand, also have some advantages: They can 
be structured to give users “virtual” audio 
window hardware whose complexity var¬ 
ies as needed, they work together with 
telecommunications services that produce 
or carry audio, and they share the cost of a 
high-quality system among a potentially 
large community. There could be a signifi¬ 
cant need for both architectures and many 
cases in which they must work together. 

Our preliminary work suggests that an 
effective audio window system needs 
much less complexity and fewer levels of 
digital-signal-processing precision than 
our current prototype. With appropriate 
metaphors, low-cost techniques can handle 
human input to audio window managers. 
Our work plan focused on techniques first, 
applications next, and simplifications last. 

Terminal-based system. In this approach, all needed digital signal
processing would be done in the terminal, PC, or workstation. Such a
system would use an architecture like that in Figure 3 and could use
Vox [2]. A terminal-based approach gives users access to audio
windowing without connecting to a network.

It is unclear if digital signal processing and switching for audio
windowing could be folded into future desktop computing products, but
we expect it to contribute at least some additional cost. This cost,
plus the other functional needs of network-based systems, motivates
the network-based server design.

Network-based server. Figure 5 shows an architecture for a
network-based audio window server. Digital signal processing and
configuration switching resources are allocated as needed from a
shared pool. This arrangement allows systems to be created on demand
with widely variable numbers of channels. It also takes advantage of
per-resource utilization gains inherent in shared-resource systems.
Another advantage is the sharing of common signals in group
applications. For example, in teleconferencing, users might have all
processing parameters in common except their individual head
positions, in which case the middle switch in Figure 5 simply fans out
the same signals from the previous stages to multiple instances of
spatial processing. In integrated-service networks, the network can
provide audio window functions to manage the multiple audio streams
that would be sent to the user. This reduces the number of audio
channels sent to each user to two and gives terminals with limited
audio features (such as telephones and current PCs with audio
capabilities) access to audio windowing.

Working together. There may be cases 
where both methods must be used simulta¬ 
neously. For example, a user might want 
to impose local audio sources atop the 
signals provided by a network-based au¬ 
dio window server. 

Our system lets terminal and network-based methods work together by
exploiting points of linearity in the audio windowing processing
pipeline and by exploiting points between processing stages. Since the
final stage involves a linear mix of the outputs from several spatial
processing sections, the final mixing stage can be split into a
network-server part and a terminal part. The outputs from the
network-server part are transmitted through the network to the
terminal, where they become inputs to a final mixer in the terminal.
This mixer can also receive the outputs of local spatial-processing
stages in the terminal. To exploit points in the pipeline,
hierarchical processing can be invoked separately from the spatial
processing. (Note that the order of hierarchical and spatial
processing is most likely not interchangeable except with careful
design.) The development of protocols to let audio window systems work
together at the control level remains an unexplored area.
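
The linearity argument can be checked with a small Python sketch.
Because the final stage is a plain sum of stereo signals, mixing the
network-side signals first and adding the local signals afterward
gives the same result as mixing everything in one place. The function
name `mix` and the sample values in the usage note are assumptions for
illustration.

```python
def mix(streams):
    """Sum a list of (left, right) sample-list pairs of equal length."""
    n = len(streams[0][0])
    left = [sum(stream[0][i] for stream in streams) for i in range(n)]
    right = [sum(stream[1][i] for stream in streams) for i in range(n)]
    return left, right
```

With `network` and `local` each a list of spatially processed stereo
pairs, `mix([mix(network)] + local)` models the split arrangement
(server mix sent over the network, then a final mix in the terminal)
and should match `mix(network + local)` up to floating-point rounding.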


The possibilities and potential for audio windowing create many needs
and opportunities for future work. For example, we have developed but
not tested an audio cursor, modes that have sources in motion, audio
icons, very distant sources, and multi-index extensions of the
one-index hierarchy. A good deal of formal subjective testing is
needed to study such issues as user fatigue and the effectiveness of
techniques and metaphors. A psychologist has worked out a subjective
testing form and procedure, but we have not been able to perform the
testing yet. Audio window systems invite hardware development as well.
Better pitch shifters (without glitches or flanges) are needed, while
very large scale integrated circuits could make audio windowing
accessible to terminal designers and value-added network-service
providers.

An important area still unexplored is the 
treatment of bursty (sporadically active) 
audio sources. With or without a comple¬ 
mentary graphical display, some method is 
often needed to establish the position of 
these sources when they are dormant. One 
approach might deliberately introduce 
low-level background noise as a continu¬ 
ous reminder of the bursty source’s exis¬ 
tence. Rare audio events signaling some 
condition that disappears when acknowl¬ 
edged (such as an alarm or a phone ring) 
could be treated as a pop-up audio 
window. ■ 


Acknowledgments 

We thank Ying Wu for computer interface 
assistance (hardware and drivers), Scott Foster 
for Convolvatron support, Renee Doctor for 
subjective testing methods, and Mark Levine, 
Damon Antos, and Sarah Martin for X11 help.
We also thank the editors and Mary Lea Crawley 
for their extensive editorial assistance. 


References 

1. “Multimedia Stirs a Revolution in Corpo¬ 
rate US: Apple, IBM, Others Rush into New 
Market,” PC Week , Vol. 6, No. 35, Sept. 4, 
1989, p. 109. 

2. B. Arons et al., “The Vox Audio Server,” 
Multimedia 89, IEEE. 

3. W. Slawson, Sound Color, University of 
California Press, Berkeley, 1985. 

4. J. Blauert, Spatial Hearing, MIT Press, 
Cambridge Mass., 1983. 

5. G. Kendall and W. Martens, “Simulating the 
Cues of Spatial Hearing in Natural 
Environments,” Proc. 1984 Int’l Computer 
Music Conf., Paris, pp. 111-125. 


6. R.A. Bolt, The Human Interface, Van Nos¬ 
trand Reinhold, New York, 1984. 

7. I. Greif, Computer-Supported Cooperative 
Work, Morgan Kaufmann, San Mateo, 
Calif., 1988. 

8. L. Ludwig and D. Dunn, “Laboratory for 
Emulation and Study of Integrated and 
Coordinated Media Communication,” Proc. 
ACM SIGComm, 1987. 

9. L. Pate and R. Lake, “A Network Environ¬ 
ment for Studying Multimedia Network 
Architecture and Control,” Proc. Globecom 
89, IEEE, pp. 1,232-1,236. 

10. L. Ludwig and N. Pincever, “Audio Windowing
Realization,” tech. report, to be
submitted to IEEE Acoustics, Speech, and 
Signal Processing (contact Ludwig at the 
address below). 

11. E. Wenzel, F. Wightman, and S. Foster, “A 
Virtual Display System for Conveying 
Three-Dimensional Acoustic Information,” 
32nd Human Factors Soc. Proc., 1988, pp. 
86-90. 

12. M. Cohen and L. Ludwig, “Multidimen¬ 
sional Audio Window Management for 
Spatial Sound,” to be published in Int’l J. 
Man Machine Studies. 



Lester Ludwig is principal scientist at Vicor, 
Inc. Previously, he established Bell Communi¬ 
cations Research’s Integrated Media Architec¬ 
ture Laboratory in 1986. His interests include 
multimedia telecommunications, stochastic 
control of resource allocation, scientific visuali¬ 
zation, signal processing, human interface sys¬ 
tems, and wide-area wide-band networking. 

Ludwig holds a PhD in electrical engineering 
and computer science from the University of 
California at Berkeley, and MS and BS degrees 
in electrical engineering from Cornell University.



Natalio Pincever is working on his MS degree 
in media technology at the Massachusetts Institute
of Technology’s Media Laboratory. His
research interests include digital audio process¬ 
ing, desktop audio, human interfaces, and multi- 
media communications. 

Pincever holds a BS degree in electrical engi¬ 
neering from the University of Florida. He is a 
member of the IEEE Computer Society. 



Michael Cohen is a PhD student in the Electri¬ 
cal Engineering and Computer Science Depart¬ 
ment at Northwestern University, where he is 
researching spatial sound with the Computer 
Music Group. His interests are in telecommuni¬ 
cations semiotics, including stereotelephonics, 
digital typography, and hypermedia. 

Cohen received his BS in electrical engineer¬ 
ing from Brown University and his MS in 
computer science from the University of Wash¬ 
ington. He is a member of ACM. 


Readers can contact Lester Ludwig at Vicor, 
525 University Ave., Suite 1307, Palo Alto, CA 
94301. 

















PX: Supporting Voice 
in Workstations 


Ragui Kamel, Kamyar Emami, and Robert Eckert 
Bell-Northern Research 


Voice communication is merging
with computer applications. On 
the one hand, the emergence of the 
Integrated Services Digital Network 
(ISDN) and the increasing availability of 
computer interfaces to telephone switches 
enable computers to exploit voice and 
voice communications. On the other hand, 
call-processing features and voice applica¬ 
tions that complement the telephone net¬ 
work are being built on computers. 

The desktop workstation is a major 
area where communication meets applica¬ 
tions. Today’s workstations already have 
the basic capability to support voice. Pro¬ 
cessor speed, bus bandwidth, and I/O in¬ 
terface speeds are high enough to com¬ 
fortably manage a large number of voice 
operations. Commonly found 30-mega- 
byte local Winchester disk drives can hold 
an hour of uncompressed telephone- 
quality voice. Newer workstations are 
providing digital signal processing (DSP) 
chips as a standard component for speech 
processing. 
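The storage claim above can be checked with a little arithmetic. The sketch below assumes telephone-quality voice means 8,000 one-byte samples per second (64 kbit/s companded PCM), which is the usual telephony figure but is not stated in this paragraph; the function name is illustrative only.

```python
SAMPLE_RATE_HZ = 8000    # assumed telephone-quality sampling rate
BYTES_PER_SAMPLE = 1     # assumed 8-bit companded PCM


def voice_bytes(seconds):
    """Bytes needed to store `seconds` of uncompressed voice."""
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


one_hour = voice_bytes(3600)
print(one_hour)               # 28800000
print(one_hour / 1_000_000)   # 28.8
```

Under these assumptions an hour of voice needs about 28.8 million bytes, which fits on a 30-megabyte drive with little room to spare.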

In many ways, the situation is analogous 
to the time when the ready availability of 
low-cost personal graphics devices made 
the extensive use of graphics in worksta¬ 
tions possible. However, availability of 
devices alone was not sufficient for wide¬ 
spread use of graphics. Architectures and 
toolkits that support higher level window¬ 
ing and graphic abstractions were needed 
to stimulate extensive use of graphics in 
applications. We believe the same will be 
true for voice. Voice will be common in 
workstations when architectures that allow
easy manipulation of voice come into
existence.

The Personal Exchange (PX) research project explores an
architecture to provide personal workstation users with
dexterity in manipulating voice. This article describes
PX concepts and an initial implementation of the
architecture.

The goal of the Personal Exchange (PX) 
research project is to explore architectures 
that provide workstations with the same 
dexterity in communicating, storing, re¬ 
trieving, and processing voice as already 
provided for data and graphics. The PX 
project is conducted at the Bell-Northern 
Research (BNR) Computing Research 
Laboratory (CRL). 

0018-9162/90/0800-0073$01.00 © 1990 IEEE


Applications. A multitude of applica¬ 
tions become possible once voice is inte¬ 
grated with computing. They include 
computer assistance for voice communica¬ 
tion, integration of voice and data in docu¬ 
ments, and voice calling/response applica¬ 
tions. 

Computer assistance for voice commu¬ 
nication. This involves using the extensive 
man-machine interface and logic capabil¬ 
ity of computers to enhance a user’s han¬ 
dling of voice communication. 

The simplest example is to provide fast 
dialing by clicking on a name contained in 
one of the multiple telephone directories. 
Some of these directories (for example, 
company directories) are generic while 
others are personal with new names added 
via simple form fill. Additional features 
would include call logging (with the abil¬ 
ity to annotate calls with text or voice 
memos) and iconic access to telephony 
features. Most people never use telephone 
features such as call forwarding, since 
these features have complicated user inter¬ 
faces (for example, hit a special feature 
key on the telephone, enter the feature 
number, and enter a forwarding telephone 
number). In a PX environment, you drag 
the telephone icon onto a directory entry 
for the person you wish to be forwarded to. 

More important than assistance in making
telephone calls is assistance in answering
them. More than half of all business
telephone calls do not result in conversations
with the intended people and lead to
“please call me back” messages. “Telephone
tag” has become the only sport in
which many people partake. Computer-based
answering assistants that use caller
identification to deliver personalized messages,
route calls to other numbers, or even
page for urgent calls can dramatically
improve communication and reduce user
frustration. One such assistant is described
in the accompanying “PX Answering
Machine” sidebar. Examples of other research
systems that provide this type of
assistance include the MIT Phone Slave [1],
the MIT Conversational Desktop [2, 3], and
the Etherphone [4] from the Xerox Palo Alto
Research Center.


The PX Answering Machine

The PX Answering Machine is one of the most useful PX
applications to date. Its development history also illustrates
how PX availability has stimulated new ideas and their rapid
implementation.

The first PX Answering Machine behaved like a conventional
answering machine. It detected an incoming call, answered
after three rings, played a greeting, and recorded a
message. Messages were displayed in a message log that
could be edited.

The PX Answering Machine was deployed in the CRL.
Within two months, many features were added at users’
requests. The features (see the figure) include

• Selectable default greetings, where the user could prerecord
several greetings (default greetings in the Configuration
window) and select the greeting to be currently active by
“point and click.”

• Time-based greetings, where the selection of the active
default greeting is based on the time of day. For example,
between 11:30 a.m. and 12:30 p.m., the default greeting
could say the user is at lunch; and after 5:30 p.m., the default
greeting could say the user has gone home.

• Calling-line-based greetings, where a personal message is
delivered if the incoming call originates from a particular calling
number. However, since the calling number does not guarantee
a person, speaker verification for very confidential messages
was added to the feature. One such greeting is shown in
the “Greeting - peter” window in the figure.

• Code-based greetings, a variation on calling-line greetings
for use when calling-line is not available. Here, the user gives
specific codes to people likely to receive personal messages
(such as superior, secretary, spouse, squash partner). The
caller uses DTMF key presses to enter the code and retrieve
the personal greeting.

• Paging for urgent messages, to alert the user when an urgent
call arrives. An urgent call can be defined as one originating
from a specific number, one where a specific code is entered,
or one during which the caller tells the answering machine
that the call is urgent.

While it is worthwhile to experiment with these features and
assess their impact on improving communication, PX’s value
lies in its ability to provide an environment that stimulates rapid
development and experimentation with such applications.

Features of the PX Answering Machine.
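The greeting features listed in the “PX Answering Machine” sidebar amount to a small selection rule. The following is a minimal sketch of such a rule, not the PX implementation: the configuration tables, the example numbers and codes, and above all the precedence (code, then calling number, then time of day) are assumptions made for illustration.

```python
from datetime import time

# Hypothetical configuration; the article does not give a data format.
TIMED_GREETINGS = [
    (time(11, 30), time(12, 30), "at lunch"),
    (time(17, 30), time(23, 59, 59), "gone home"),
]
NUMBER_GREETINGS = {"5551234": "personal greeting for peter"}
CODE_GREETINGS = {"42": "personal greeting for squash partner"}


def select_greeting(now, calling_number=None, code=None,
                    default="standard greeting"):
    """Pick the greeting to play for one incoming call.

    Assumed precedence: DTMF code, then calling number, then time of day.
    """
    if code in CODE_GREETINGS:
        return CODE_GREETINGS[code]
    if calling_number in NUMBER_GREETINGS:
        return NUMBER_GREETINGS[calling_number]
    for start, end, greeting in TIMED_GREETINGS:
        if start <= now < end:
            return greeting
    return default


print(select_greeting(time(12, 0)))                            # at lunch
print(select_greeting(time(12, 0), calling_number="5551234"))  # personal greeting for peter
print(select_greeting(time(9, 0)))                             # standard greeting
```

A real assistant would also need the speaker-verification step the sidebar mentions before releasing a confidential greeting; that step is omitted here.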






















































Figure 1. Hardware configuration of the PX system. 


Integration of voice and data in docu¬ 
ments. The ability to include and transmit 
voice as part of documents has many uses. 
For example, a reviewer will find it more 
effective to provide lengthier comments 
on a paper by voice. Another example 
(developed in PX) consists of updating a 
calendar with voice descriptions of the 
topic and agenda of a meeting. Finally, a 
very useful application is making presen¬ 
tations without being present by annotat¬ 
ing each viewgraph with the appropriate 
voice commentary. A good example of a 
multimedia system that integrates voice 
with data is the BBN Diamond project [5].

Voice calling/response applications.
These applications use voice either to de¬ 
liver messages or to respond to requests. 
An obvious commercial example of voice 
calling is telemarketing, involving dialing 
telephone numbers from a database and 
playing a marketing message. A more 
personal use of the same concept entails 
the automatic calling of a group of people 
to deliver a message that a meeting is about 
to start, that its time has been changed, or 
that it has been canceled. 

Voice response is the other side of the 
same coin. Here, a user phones and ac¬ 
cesses information using a mixture of voice 
and dual-tone multifrequency key presses. 
In the calendar example above, you could 
book an appointment by selecting the date, 
time, and duration using keys and by pro¬ 
viding the topic and agenda via voice 
annotation. Other examples include voice 
access to databases. An important subcase 
of this application involves having your 
text e-mail messages read remotely over 
the telephone. 

Goals of the PX project. Part of the PX 
project consists of building some of the 
above applications. However, this is done 
mostly as a way of validating the voice- 
manipulation architectures. In that sense, 
our goals are similar to those of the Etherphone
project [6], the Bellcore MICE project [7],
and the Olivetti Vox audio server
project [8].

In contrast to Vox, we are not only con¬ 
cerned with the voice-connection architec¬ 
ture internal to a single computer but also 
with the voice-switching architecture be¬ 
tween computers. Unlike Etherphone and 
MICE, we are migrating and distributing 
the call-processing functionality to desk¬ 
top workstations instead of providing it in 
a central server. We believe the distributed 
architecture will allow faster development 
and customization of advanced call-pro¬ 
cessing features. 


PX hardware 

Figure 1 shows the hardware configura¬ 
tion of the PX system, including worksta¬ 
tion, telephone, and circuit server. 

Workstation and telephone. Each PX 
user has a workstation and an associated 
telephone set. The telephone acts as a 
microphone and speaker and digitizes 
voice for ISDN-compatible telephony. The 
telephone also provides a basic fall-back 
telephony service whenever the worksta¬ 
tion telephone management is not opera¬ 
tional. 

The workstation runs user voice appli¬ 
cations. In keeping with the distributed 
nature of the architecture, the telephony 
functions also run in the workstation when¬ 
ever possible. For example, such advanced 
telephony features as call forwarding are 


implemented as a workstation service, 
using the circuit server only to connect 

Circuit server. The voice circuit server 
is used to make voice connections and to 
interface to the public telephone network. 
The PX architecture’s major requirement 
on the circuit switch is that it provide an 
open external control interface. Unfortu¬ 
nately, open switches remain rare, and it is 
this scarcity that led the Etherphone group 
to design its own switching system [4]. For
PX, using a Northern Telecom ISDN- 
compatible key system meets the open 
interface requirement. 

Workstation adaptor. Shown in Figure
2, the workstation adaptor allows the workstation
to control its associated telephone
and the circuit server. It also passes digitized
voice among all three elements. On
one side, the adaptor connects to a switch
telephone line; on the other, it plugs into an
RS-422 port of the workstation. The telephone
set is also connected to the adaptor
and is usually controlled by the workstation.
However, whenever the workstation
is inoperative, the adaptor will by default
connect the telephone to the switch, thus
causing it to operate as a traditional telephone
set.

Figure 3. PX connection architecture.

Within the workstation, a device driver 
delivers packets to and from the worksta¬ 
tion adaptor. A packet can contain voice (B 
channel data), switch control (D channel 
message), or adaptor control data (control 
messages between workstation and adap¬ 
tor). 

In the interest of an open architecture, 
the workstation adaptor is designed as an 
external, separately powered box that 
plugs into any workstation’s high-speed 
RS-422 interface. It is also possible to 
modify the adaptor to interface to a new 
switch without significantly affecting the 
workstation software. We have built a 
version of the workstation adaptor that 
interfaces to a standard analog telephone 
line. 

The workstation adaptor contains 32 
kilobytes of memory for voice. The divi¬ 
sion of that memory into buffers is config¬ 
urable through downloadable firmware, to 
permit different workstations and operat¬ 
ing systems to select the right balance 
between CPU efficiency and tolerance for 
interrupt latency. In the current configura¬ 
tion, voice is transferred between the work¬ 
station adaptor and the workstation at up to 
100 kilobytes per second in 1-kilobyte 
buffers (125 milliseconds of voice). 
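The buffer figures above can be related to one another with simple arithmetic. This sketch assumes one direction of telephone voice occupies 8,000 bytes per second (8 kHz, 8-bit PCM); that rate is an assumption, though it is the one that makes a 1-kilobyte buffer come out at 125 milliseconds.

```python
VOICE_BYTES_PER_SECOND = 8000   # assumed: 8 kHz, 8-bit telephony PCM


def buffer_duration_ms(buffer_bytes):
    """Milliseconds of voice held in a buffer of the given size."""
    return 1000 * buffer_bytes / VOICE_BYTES_PER_SECOND


print(buffer_duration_ms(1000))              # 125.0
print(buffer_duration_ms(1024))              # 128.0
print(buffer_duration_ms(32 * 1000) / 1000)  # 4.0 (seconds held in adaptor memory)
print(100_000 / VOICE_BYTES_PER_SECOND)      # 12.5 (link rate relative to one voice stream)
```

The quoted 125 milliseconds therefore corresponds to a decimal kilobyte (1,000 bytes). On the same assumptions, the 32 kilobytes of adaptor memory hold roughly four seconds of voice, which bounds how much interrupt latency the workstation can tolerate before samples are lost.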


PX architectural 
concepts 

Voice communication basically entails 
establishing connections along which 
voice flows. Thus, the fundamental ma¬ 
nipulation the PX architecture enables is 
establishing connections. The origin and 
destination of a connection is always a PX 
device. Typical devices are telephone sets, 
telephone lines, speech synthesizers, and 
recognizers. Devices need not be physical 
voice encoding or decoding devices. For 
example, a voice-storage toolkit is not a 
physical device but consists of software 
that stores and retrieves voice from special 
format files on a standard file system. 

A PX application does not handle voice 
directly or provide connection endpoints. 
It performs its task by instructing devices 
to connect to each other. For example, an 
answering machine establishes connec¬ 
tions between the voice-storage device and 
the telecom-line device to play announce¬ 
ments and record messages. 
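The division of labor described above, where the application only asks devices to connect and never touches the voice itself, can be sketched as follows. The `Device` class and its `connect` and `disconnect` methods are hypothetical names chosen for illustration; the article does not describe PX's programming interface at this level.

```python
class Device:
    """A named connection endpoint that records which devices it feeds."""

    def __init__(self, name):
        self.name = name
        self.sinks = []

    def connect(self, sink):
        if sink not in self.sinks:
            self.sinks.append(sink)

    def disconnect(self, sink):
        if sink in self.sinks:
            self.sinks.remove(sink)


def answering_machine(line, storage):
    """Return the connections an answering machine requests, in order."""
    steps = []
    storage.connect(line)       # play the greeting to the caller
    steps.append((storage.name, line.name))
    storage.disconnect(line)
    line.connect(storage)       # record the caller's message
    steps.append((line.name, storage.name))
    return steps


line = Device("telecom line")
storage = Device("voice storage")
print(answering_machine(line, storage))
# [('voice storage', 'telecom line'), ('telecom line', 'voice storage')]
```

Note that the application code contains no sample handling at all; the voice would flow between the two devices once each connection is made.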

Applications interact with devices by 
using messages to establish connections or 
to receive notification of external events. 
In the answering-machine example, mes¬ 
sages are used to establish the recording 
connections described above. They also 
indicate when a call comes in, when to
respond, and when to hang up. 

The concept of messages and connec¬ 
tions extends readily from a single work¬ 
station across multiple workstations. For 
example, a telephone call is an exchange of 
messages culminating in the connection of 
two telephone devices, each belonging to a 


separate workstation. This architecture is 
illustrated in Figure 3. 

Connections. Voice flows between the 
end point devices of a connection with 
minimal involvement from applications. 
Some connections are realized entirely in 
hardware. For example, during a telephone 
call, voice flows through a physical con¬ 
nection from the switch to the telephone- 
set speaker; the voice stream is not ma¬ 
nipulated by software. In other cases, soft¬ 
ware involvement is necessary. For ex¬ 
ample, playing back recorded voice re¬ 
quires that the voice-storage subsystem 
read the digitized voice from a disk and 
transfer it along a connection to its destination.

Software also needs to be involved if the 
voice must be converted from one digital 
format to another. For example, playing 
telephone-recorded voice over the work¬ 
station speaker may require a conversion 
of voice format from μ-law pulse code modulation
(PCM) to linear PCM.
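As an illustration of the kind of format conversion mentioned above, the sketch below expands one μ-law byte to a linear sample using the widely published G.711 expansion. It assumes PX's telephony encoding is standard G.711 μ-law, which the article implies but does not spell out; the function name is illustrative.

```python
BIAS = 0x84  # bias added by the mu-law encoder before segment coding


def ulaw_to_linear(byte):
    """Expand one 8-bit mu-law code to a signed linear PCM sample."""
    u = ~byte & 0xFF                 # mu-law bytes are stored inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07       # segment number
    mantissa = u & 0x0F              # step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


print(ulaw_to_linear(0xFF))   # 0
print(ulaw_to_linear(0x80))   # 32124
print(ulaw_to_linear(0x00))   # -32124
```

Converting a whole clip is then a matter of mapping this function over the stored bytes, or more efficiently indexing a precomputed 256-entry table.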

Note that a connection need not be point- 
to-point. Recording a conversation re¬ 
quires four connections: one each from the 
telephone-set microphone to the telephone 
line and to the voice-storage toolkit, and 
one each from the telephone line to the 
telephone-set speaker and to the voice- 
storage toolkit. 
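The four connections needed to record a conversation can be written down explicitly. The device names below follow the text; representing a connection as a (source, destination) pair is an assumption made for this sketch.

```python
from collections import defaultdict

MIC = "telephone-set microphone"
SPEAKER = "telephone-set speaker"
LINE = "telephone line"
STORAGE = "voice-storage toolkit"

RECORD_CONVERSATION = [
    (MIC, LINE),       # local speech goes out to the far end
    (MIC, STORAGE),    # and is also recorded
    (LINE, SPEAKER),   # far-end speech is heard locally
    (LINE, STORAGE),   # and is also recorded
]


def fan_out(connections):
    """Group destinations by source to show which sources feed several sinks."""
    sinks = defaultdict(list)
    for source, destination in connections:
        sinks[source].append(destination)
    return dict(sinks)


print(fan_out(RECORD_CONVERSATION))
```

Grouping by source makes the point-to-multipoint shape visible: both the microphone and the line feed two destinations each.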

Messages. Messages are the control
mechanisms in the PX architecture. They
are used within a workstation to establish
connections between devices. Messages
logically extend across a PX system to
negotiate connections between different
workstations. The extension can be carried
further to encompass geographically dispersed
PX systems.

Step  Originator phone manager           Receiver phone manager
1     Broadcast a message to all
      workstations indicating desired
      directory number.
2                                        Recognize incoming call to own
                                         directory number.
3                                        Send message to originator
                                         indicating user is being alerted.
4                                        When user responds, send message
                                         to originator indicating that the
                                         call is accepted.
5     When call is accepted, instruct    Instruct switch to make voice
      switch to make voice connection.   connection.

Figure 4. Message exchange protocol to establish a call.

The classic case of using messages is to 
establish a connection for a telephone call 
between two workstations. Each worksta¬ 
tion has a directory number (DN) that 
uniquely identifies it. The sequence of 
messages exchanged between the telephone 
managers of each workstation to establish 
the connection is shown in Figure 4. 
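The exchange in Figure 4 can be traced with a short simulation. This is a sketch of the sequence only: the function name, the log format, and the example directory numbers are invented for illustration, and no real message formats are implied.

```python
def establish_call(originator_dn, target_dn, known_dns, user_answers=True):
    """Return the ordered (step, actor, action) log for one call attempt."""
    log = [(1, originator_dn, f"broadcast call request for DN {target_dn}")]
    if target_dn not in known_dns:
        return log                      # no phone manager recognizes the DN
    log.append((2, target_dn, "recognize incoming call to own DN"))
    log.append((3, target_dn, f"tell {originator_dn}: user is being alerted"))
    if not user_answers:
        return log
    log.append((4, target_dn, f"tell {originator_dn}: call accepted"))
    log.append((5, originator_dn, "instruct switch to make voice connection"))
    log.append((5, target_dn, "instruct switch to make voice connection"))
    return log


for entry in establish_call("2001", "2002", {"2001", "2002"}):
    print(entry)
```

The switch appears only in the final step, which reflects the architecture's intent that call control lives in the workstations and the switch merely connects circuits.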

Additional messages are exchanged 
within a workstation. For example, to alert 
the user, incoming call messages are seen 
by both the telephone-set manager (for an 
audible telephone-set ring) and the screen 
phone (for a visual prompt). If the user 
picks up the handset, the telephone-set 
manager indicates this by transmitting a 
message to the telecom-line manager. Al¬ 
ternatively, if the user answers from the 
workstation through the screen phone, a 
message is sent from the screen phone to the 
telephone-set manager to request hands¬ 
free use in addition to notifying the tele- 
phone-set manager. 

Messages are broadcast over a logical 
message bus. Thus, the sender is not con¬ 
cerned with the target of the message. A 
typical case is a telecom-line device that 
wishes to indicate an incoming call but does 
not need (or want, for that matter) to know 
if the call is handled by a telephone-set 
manager, a screen-phone manager, or an 
answering machine. If one of these applica¬ 
tions responds to the incoming call mes¬ 
sage, then it too broadcasts on the message 
bus to inform the telecom-line device and 
any other potentially interested applica¬ 
tions so that they can take appropriate ac¬ 
tion. 
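The broadcast style described above, in which a sender does not know who will act on a message, resembles a publish/subscribe bus. The sketch below is an in-process illustration under that assumption; the `MessageBus` class, the dictionary message format, and the message type names are all hypothetical.

```python
class MessageBus:
    """Delivers every message to every attached component."""

    def __init__(self):
        self._listeners = []

    def attach(self, listener):
        self._listeners.append(listener)

    def broadcast(self, message):
        # Iterate over a snapshot so handlers may broadcast in turn.
        for listener in list(self._listeners):
            listener(message)


bus = MessageBus()
seen_by_line = []
seen_by_screen = []


def telecom_line(message):
    if message["type"] == "call_answered":
        seen_by_line.append(message)      # the line learns who took the call


def screen_phone(message):
    if message["type"] == "incoming_call":
        seen_by_screen.append(message)    # show a visual prompt


def answering_machine(message):
    if message["type"] == "incoming_call":
        bus.broadcast({"type": "call_answered", "by": "answering machine"})


for component in (telecom_line, screen_phone, answering_machine):
    bus.attach(component)

bus.broadcast({"type": "incoming_call", "dn": "1234"})
print(seen_by_line)
print(seen_by_screen)
```

The telecom-line component never names the answering machine, and a new handler could be attached without changing any existing component.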

The physical message bus between work¬ 
stations is implementation dependent. 
The telephone line D channel supported by 
the switch is used in the initial implementa¬ 
tion. However, the architecture allows 
messages to be transmitted over any packet 
network. 

Software structure 

Figure 5 shows the PX software struc¬ 
ture. While PX applications have free ac¬ 
cess to devices, we have found it useful to 
bundle common operations and offer them 
at a higher level of abstraction as toolkits. 
Another advantage of toolkits is that they 
isolate applications from the evolutionary 
churn of the message bus protocols. The
following toolkits have been designed in 
PX: 


• A call-processing toolkit that provides 
a higher level interface to switching 
and call-processing functions. 

• A voice-storage toolkit that allows the 
recording, playing, and editing of 
voice files. This toolkit also includes 
pause detection and voice-format con¬ 
version. 

• Voice-processing toolkits that gener¬ 
ate and process voice, including text- 
to-speech synthesis, speaker verifica¬ 
tion, identification, and recognition. 

Call-processing toolkit. Call-processing
toolkit functions include

• call processing and extended telephony,
such as dial call, answer call,
release, reject call, forward, hold, ring
again, do not disturb, transfer, and
conference;

• telephone-set control, such as set ring
type and volume control; and

• operations, administration, and maintenance
(OA&M) features, such as
query for active calls, query state, and
query subscribed features.

In the initial implementation of PX, the
call-processing toolkit is implemented by
accessing the features of the key system
over the D channel. However, we are currently
implementing distributed call processing
negotiated by messages between
workstations. In this implementation, the
switch is only used for circuit switching.

Figure 5. The PX software structure.
[Diagram labels. Applications: telephone manager, screen telephone, answering machine, calendar program, voice editor. Toolkits: call processing, voice storage, speech synthesis, speech recognition. Devices: telephone set, telephone line, workstation peripherals.]

Voice-storage toolkit. Functions of the
voice-storage toolkit include

• Voice recording and playback. These
operations can specify the source or target
set of devices (such as telephone set, telephone
line, or both). Recording options
include the ability to record until a specified
length of silence is detected. Playing
options include playing only a portion of a
voice clip. The function Stop can be used
to asynchronously stop any recording or
playing in progress.

• Voice editing and manipulation using
operations such as Concatenate, Replace,
and Delete. There are also operations to
load the voice samples into application
memory for more direct manipulation by
an application (for example, mixing).

• Voice-encoding operations that convert
voice from one format to another, for
example from telephony standard (μ-law
PCM) to workstation speaker format (linear
PCM). There are also routines to reduce
storage requirements by compressing
the voice.

Voice files are stored as conventional
files on the workstation’s file system. Such
files can reside on a user’s local disk or on
a shared remote file server. A voice file
contains multiple voice clips, each of which
is in essence a single voice recording.
Applications will typically store a limited number
of voice clips per file. For example, an
answering machine application might
store each day’s messages in a separate
file.


PX voice files

A PX voice file consists of three sections, as illustrated in
Figure 1. The header section contains bookkeeping information
such as the number of clips in the file, the amount of
space they occupy, and a pointer to free space. The control
section contains one clip descriptor for each voice clip in the
file. A clip descriptor includes a pointer to an ordered linked
list of the voice-interval descriptors comprising the clip. Interval
descriptors follow the voice-clip table. Each interval descriptor
points to the start and end of a voice sample and
links to the next interval of the clip. The data section contains
the actual voice samples stored in standard telephony
encoding.

Only a few voice operations, basically Record and Copy,
affect the data section. Other operations do not alter data
samples but manipulate voice-interval records. For example,
consider a voice file consisting of two clips, VC1 and VC2,
as shown in Figure 2.

Creating a third voice clip that concatenates a portion of
VC1 with a portion of VC2 is done by calling the voice-toolkit
function Voice_Concat as shown below:

VC3 := Voice_Concat(VC1[Interval],
                    VC2[Interval])

This will cause a new voice-clip entry to be created with its
voice interval list pointing at the desired data. Voice samples are
not modified. Figure 3 shows the file layout after concatenation.

Replacing a portion of VC1 with VC2 is done by the following
call:

VC1 := Voice_Replace(VC1[Interval],
                     VC2)

This involves splitting VC1 into three intervals, as shown in
Figure 4. The first and last intervals point to voice samples that
have not been modified from the original VC1. The second interval
points to the voice samples that made up VC2.

Header: samples size; next voice clip ID; number of clips;
free-space pointer, last block, ...
Control: interval descriptors, each an (Interval, Next pointer) pair.

Figure 1. A PX voice file consists of three sections: header, control, and data.

Figure 2. Two voice clips. (Diagram labels: VC1, VC2, speech samples.)
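The interval-list editing described in the “PX voice files” sidebar can be sketched in a few lines. This is an in-memory model only, assuming a clip is an ordered list of (start, end) offsets into a shared data section; the class, the Python function signatures, and the use of a list in place of the sidebar's linked list are all choices made for illustration rather than the PX file format.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceFile:
    data: bytearray = field(default_factory=bytearray)  # data section
    clips: dict = field(default_factory=dict)           # clip id -> [(start, end), ...]

    def record(self, clip_id, samples):
        """Append new samples; one of the few operations that touch the data."""
        start = len(self.data)
        self.data.extend(samples)
        self.clips[clip_id] = [(start, len(self.data))]

    def samples(self, clip_id):
        """Assemble a clip's samples by following its interval list."""
        return b"".join(bytes(self.data[s:e]) for s, e in self.clips[clip_id])


def slice_intervals(intervals, lo, hi):
    """Intervals covering clip offsets [lo, hi), without copying samples."""
    out, pos = [], 0
    for s, e in intervals:
        length = e - s
        a, b = max(lo, pos), min(hi, pos + length)
        if a < b:
            out.append((s + a - pos, s + b - pos))
        pos += length
    return out


def voice_concat(vf, new_id, a_id, a_range, b_id, b_range):
    """Create a new clip from a portion of clip a followed by a portion of clip b."""
    vf.clips[new_id] = (slice_intervals(vf.clips[a_id], *a_range)
                        + slice_intervals(vf.clips[b_id], *b_range))


def voice_replace(vf, clip_id, rng, other_id):
    """Replace offsets [rng[0], rng[1]) of a clip with all of another clip."""
    old = vf.clips[clip_id]
    total = sum(e - s for s, e in old)
    vf.clips[clip_id] = (slice_intervals(old, 0, rng[0])
                         + list(vf.clips[other_id])
                         + slice_intervals(old, rng[1], total))


vf = VoiceFile()
vf.record("VC1", b"abcdefgh")
vf.record("VC2", b"XYZ")
voice_concat(vf, "VC3", "VC1", (2, 5), "VC2", (0, 2))
print(vf.samples("VC3"))      # b'cdeXY'
voice_replace(vf, "VC1", (3, 5), "VC2")
print(vf.samples("VC1"))      # b'abcXYZfgh'
print(len(vf.data))           # 11
```

After both edits the data section is still 11 bytes long: only the interval lists changed, and the replaced VC1 consists of three intervals as the sidebar describes.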

Our voice file structure resembles that
of the Etherphone project [9]. The major difference
is that we use files rather than a
database to store voice. Exploiting files of 
the workstation’s native file system means 
PX is not concerned with such issues as file 


management, copy and delete utilities, 
garbage collection, server-load manage¬ 
ment, and backup. The “PX voice files” 
sidebar contains a brief description. 

Voice-processing toolkits. Additional 
services provided as PX toolkits include 
speech synthesis, speaker verification, 
speaker identification, and speech recognition.


Speech synthesis is necessary for appli¬ 
cations where it is impossible or impracti¬ 
cal to prerecord the necessary voice. A 
prime example would involve users ac¬ 
cessing their PX workstation over the tele¬ 
phone to have e-mail messages read to them.

Speech synthesis is provided in PX as a 
toolkit that accepts a text string and plays 
the resulting encoded speech along a voice 
connection. Initially, we are providing a 
shared server using Digital Equipment’s 
DECtalk speech synthesis system [10]. However,
in keeping with PX’s goal, we will
eventually run the service on the workstation.

Speaker verification is used in PX as a 
security measure for remote telephone 
access to one’s PX workstation. It is also 
used in the answering machine (see the 
“PX Answering Machine” sidebar) to ver¬ 
ify the identity of crucial callers before 
delivering personalized messages to them. 
Speaker verification is implemented in 
software using a template-matching algo¬ 
rithm. 

Speaker identification is used in the 
answering machine to identify a small 
number of crucial callers to whom one 
wants to give personalized responses (for 
example, boss, spouse, or stockbroker). 
Unlike ISDN, which only provides the 
calling number, speaker identification 
recognizes the calling person. Speaker 
identification is implemented in software 
using a new search heuristic. 

Speaker-dependent speech recognition 
is not yet implemented in PX. We intend to 
use it to access PX features from a remote 
telephone. While the same functionality is 
achieved through DTMF key presses, we 
believe that limited speech recognition 
will make the interface easier to use. 

The PX voice editor. The voice-editor 
tool is a major PX application that allows 
real-time voice editing of speech samples 
using graphics to display voice data. The 
PX voice editor is a two-level editor. The 
first level deals with manipulating entire 
voice clips. Clips can be recorded, played, 
deleted, moved, renamed, etc. The second 
level allows editing an individual clip. The 
clip is displayed either in a form showing 
its sound and silence patterns or as graphi¬ 
cal waveforms. The user can perform edit¬ 
ing operations such as cutting, copying, or 
pasting. Figure 6 displays an edit session. 

As in Etherphone [9], few editing operations
are performed within a voice clip. In
our experience, users perform insertion,
replacement, and deletion operations on a
full-sentence or phrase basis rather than on
a single-word basis. This is done by manipulating
full clips in the upper window.
However, a user can edit the contents of a
clip by manipulating either the sound-silence
form (bottom-right window) or the
more detailed sound waveform (bottom-left
window).

Figure 6. Display of a voice editor.

To facilitate editing, the user can also
associate text strings with particular positions
within a clip. The editor also supports
timed scan and play (that is, skip ahead x
seconds and play y seconds) and fast forward
(that is, playing voice at a faster rate
with constant pitch).
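The timed scan-and-play operation can be illustrated with a short sketch. The function below is not taken from the PX editor; its name, its parameters, and the assumption that the skip-then-play pattern repeats until the end of the clip are ours, introduced only to make the "skip ahead x seconds and play y seconds" idea concrete.

```python
from typing import Iterator, Tuple


def scan_and_play(clip_seconds: float, skip: float, play: float,
                  start: float = 0.0) -> Iterator[Tuple[float, float]]:
    """Yield (begin, end) playback intervals within a clip.

    Hypothetical helper: from the current position, skip ahead `skip`
    seconds, play `play` seconds, and repeat until the clip ends.
    """
    if skip < 0 or play <= 0:
        raise ValueError("skip must be >= 0 and play must be > 0")
    position = start
    while True:
        begin = position + skip
        if begin >= clip_seconds:
            return  # nothing left to play
        end = min(begin + play, clip_seconds)  # clamp to the clip end
        yield (begin, end)
        position = end


if __name__ == "__main__":
    # Scan a 20-second clip: skip 5 seconds, play 2 seconds, repeat.
    for begin, end in scan_and_play(20, skip=5, play=2):
        print(f"play {begin:.1f}s to {end:.1f}s")
```

A real editor would hand each interval to its playback routine; here the intervals are only listed.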


The goal of the PX research project
is to investigate computing archi¬ 
tectures that provide personal 
workstations with dexterity in manipulat¬ 
ing voice equivalent to today’s dexterity in 
manipulating text and graphics. The devel¬ 
opment and use of the applications we are 
building provide insight into our primary 
focus: the architecture for a personal, dis¬ 
tributed, and open environment where 
voice applications can be built and 
customized. ■ 

Acknowledgments 

Ian Bowles designed the PX workstation
adaptor and voice-transfer protocol; Larry
Brunet, the PX answering machine; and
Anne-Lise Hassenklover and Jean Jervis, the voice
editor. Bill Williams, Liam Casey, and Peter
Cashin developed the distributed telephony
architecture.


References 

1. C. Schmandt and B. Arons, “Phone Slave: A
Graphical Telecommunications Interface,”
Proc. 1984 Int'l Symp. Society for Information
Display, June 1984.

2. C. Schmandt, “Voice Interaction in an Inte¬ 
grated Office and Telecommunication 
Environment,” Proc. American Voice In¬ 
put/Output Society Conf, 1985. 

3. C. Schmandt and M. McKenna, “An Audio 
and Telephone Server For Multi-Media 
Workstations,” Proc. Second Int’l Conf. 
Computer Workstations, CS Press, Los 
Alamitos, Calif., Order No. 810, Mar. 1988, 
pp. 150-159. 

4. D. Swinehart, “Telephone Management in 
the Etherphone,” Proc. Globecom 87, IEEE 
Communications Society, Nov. 1987, pp. 
392-402. 

5. R.H. Thomas et al., “Diamond: A Multime¬ 
dia Message System Built on a Distributed 
Architecture,” Computer, Dec. 1985, Vol. 
18, No. 12, pp. 65-78. 

6. P. Zellweger et al., “An Overview of the 
Etherphone System and Its Applications,” 
Proc. Second IEEE Conf. Computer Work¬ 
stations, CS Press, Los Alamitos, Calif., 
Order No. 810, Mar. 1988, pp. 160-168. 

7. G. Herman et al., “The Modular Integrated
Communications Environment (MICE): A
System for Prototyping and Evaluating
Communications Services,” Proc. Int’l
Switching Symp., Phoenix, Ariz., Mar.
1987, pp. 442-447.

8. B. Arons et al., “A Voice and Audio Server
for Multimedia Workstations,” Proc.
Multimedia 89, Montebello, Ont., Canada, May
1989.

9. D. Terry and D. Swinehart, “Managing 
Stored Voice in the Etherphone System,” 
Proc. 11th ACM Symp. Operating Systems, 
Nov. 1988, pp. 48-61. 

10. D.H. Klatt, “How Klattalk Became DECtalk,”
Proc. Speechtech 87.


Ragui Kamel’s biography and photograph 
appear at the conclusion of the Guest Editor’s 
Introduction. 



Kamyar (Kami) Emami joined Bell-Northern 
Research in 1983 and has worked on distributed 
file systems and managed the Data Communica¬ 
tions and Software Development Environment 
groups. He is currently the manager of the Per¬ 
sonal Communications Architecture group in 
the Computing Research Laboratory. 

Emami received his BS and MS in mathemat¬ 
ics from the University of Salford, England, in 
1980 and 1982, respectively, and his MMath 
degree in computer science from the University 
of Waterloo, Canada, in 1983. 



Robert Eckert joined Bell-Northern Research 
in 1983 and worked on the Northern Telecom 
Meridian mail voice-messaging system. He 
then joined the Computing Technology explora¬ 
tory group responsible for evaluating computer 
architectures and RISC technology. Currently, 
he is a member of the Personal Communication 
Architecture group in the Computing Research 
Laboratory. 

Eckert received his BS in computer science 
from McGill University, Canada, in 1981 and 
his MMath degree from the University of Wa¬ 
terloo in 1983. 

The authors can be contacted at Bell-North¬ 
ern Research, Computing Research Laboratory, 
PO Box 3511, Station C, Ottawa, Ontario, 
Canada K1Y 4H7. 









































CALL FOR PAPERS

Fourth Software Engineering Standards Application Workshop

May 21-23, 1991
San Diego, California

Sponsor: IEEE Computer Society

Theme: To assess the application of Software Engineering Standards
and to project the course of Software Engineering Standards
for the next decade.


Of Standards: 


TOPICS OF INTEREST 

ASSESSMENT 

. Historical Perspective 

Are We Standardizing The "Right Thing" ? 

What Has Been The Effect Of Standards 
On The Software Engineering (S.E.) Process 
Are Standards Leading Or Trailing Current E 
What Is The Process Of Producing Standards: 

. Status 

How Do We Integrate Standards 
What Are The Liability Implica 
Where/How Do I Apply Standards? 

What Standards Are In Development? 

- IEEE - JTC-1 - DoD 

- X3 - NASA - ISO

- ASQC - NIST 

How Do You Manage The Cost Of Implementing Standards? 
What Are The Success Stories?

Can We Apply Standards After The Fact? 

Whose Standards Do We Follow? 

FUTURE DIRECTIONS 

Are Standards Necessary? 

What Is The Impact Of EC92 On Standards 

What Is Software Process/Product Certification?

Are Research Findings Being Incorporated Into Standari 
What Are Software Warranties? 

How Do Standards Affect The S.E. Process? 

What Standards Should I Follow? 


PAPER SUBMISSION 

Four (4) copies of the manuscript or 
*panel proposal should be submitted 
to: David Card 

Computer Sciences Corp. 

4061 Powder Mill Road 
Calverton, MD 20705 
Extended Abstract 

(1000 words) Due: 9/15/90 

Acceptance Notification: 11/1/90 
Camera Ready Paper Due: 1/2/91 

Presentation Slides Due: 1/30/91 

★The Chair for a panel must submit an 
abstract and issues statement. 

All panelists will be required to
submit a position statement by 1/2/91.


COMMITTEE 

D. V. Edelstein, NYNEX - General Chair 

Program Committee 

D. Card, CSC - Program Chair 

K. de Jong, Sandia Nat. Lab. 

S. Mamone, NYNEX 

A. Nathman, CSC 

N. Schneidewind, Naval Postgrad. Sch 
M. Slovin, CSC 

L. Tripp, Boeing

























PRE-REGISTER FOR THE 
15th Annual Conference on 
Local Computer Networks 


September 30 - October 3, 1990 
Minneapolis, Minnesota USA 
Sponsored by the IEEE Computer Society 

Tutorials • Technical Paper Sessions • Expert Panel Discussions • Workshop-style Interchange 


Program Chairs: 

Marc Cohn 
Raychem Corporation 
300 Constitution Drive 
Menlo Park, CA 94025 USA 
(415) 361-3902 
(415) 361-6099-FAX 

General Chairs: 

Larry Green 
Protocol Engines Inc. 

1900 State Street 
Santa Barbara, CA 93101 USA 
(805) 965-0825 
(805) 687-2984-FAX 


Jim Mollenauer 
Artel Communications 
22 Kane Industrial Drive 
Hudson, MA 01749 USA 
(508) 562-2100 
(508) 562-6942-FAX 


Ken Thurber
Architecture Tech. Corp. 

P.O. Box 24344 
Minneapolis, MN 55424 USA 
(612) 935-2035 
(612) 829-5871-FAX


Program Includes: 

Sunday, September 30, 1990 
11:30 a.m.- 1:00 p.m. Registration 

1:00 p.m.- 5:00 p.m. Tutorial 1 

"Simple Network Management Protocol (SNMP)" 
Jeff Case, Univ. of TN and SNMP Research 

Monday, October 1, 1990 

7:30 a.m. - 9:00 a.m. Registration 

9:00 a.m. - 5:00 p.m. Tutorial 2

"Intro. to Fiber Distributed Data Interface (FDDI)"
Raj Jain, Digital Equipment Corporation 
9:00 a.m. - 5:00 p.m. Tutorial 3 

"Migration to OSI" 

Tony Pringle, Princomm Inc. 




Registration Information: 

Pre-registration must be received on or before Monday, September
24, 1990; no exceptions. The fees for the tutorial on Sunday include
tutorial notes and one break. The fees for the tutorials on October 1
include luncheon, tutorial notes, and two refreshment breaks. The fees
for the conference on October 2-3 include a copy of the proceedings,
four refreshment breaks, two luncheons, and the banquet. Make all
checks payable to: 15th Conference on Local Computer Networks

Hotel: 

Each participant is responsible for their own hotel reservations. The 
conference rate is $91 (single) and $101 (double) at: 

Radisson Plaza Hotel-Minneapolis 
35 South 7th Street, Minneapolis, MN 55402 USA 
(612) 339-4900, (612) 337-9766-FAX 

Please make hotel reservations by September 10 and reference: 

15th Conference on Local Computer Networks


Tuesday, October 2, 1990

7:30 a.m. - 9:00 a.m.     Registration
9:00 a.m. - 10:15 a.m.    Keynote Session
                          "LAN Challenges of the 1990's"
                          Gordon Bell, Stardent Computer
10:30 a.m. - 12:00 p.m.   Technical Program
12:00 p.m. - 1:30 p.m.    Lunch
1:30 p.m. - 5:00 p.m.     Technical Program
5:30 p.m. - 6:30 p.m.     Reception
6:30 p.m. - 9:00 p.m.     Banquet
                          Entertainment: Dudley Riggs Brave New
                          Workshop and Instant Theatre Company

Wednesday, October 3, 1990

8:30 a.m. - 12:00 p.m.    Technical Program
12:00 p.m. - 1:30 p.m.    Lunch
1:30 p.m. - 5:00 p.m.     Technical Program


Registration Form:

Send completed form and fee to:

15th LCN Conference
Harvey A. Freeman
LANWORKS, Inc.
5871 Cedar Lake Road
St. Louis Park, MN 55*16
(612) 591-5837

Name ___
Organization ___
Address & ZIP ___
Phone ___

Registration Fees (payable in U.S. dollars):

                                  Conference  Tutorial 1  Tutorial 2  Tutorial 3
Students (include copy of
  1990 paid fee statement)        $140        $90         $185        $185
IEEE Members                      $250        $90         $185        $185
Non-Members                       $315        $115        $230        $230

Registrations received after Sept. 24 will be returned. Late
registrations at the conference will be at the following rates:

                                  Conference  Tutorial 1  Tutorial 2 & 3
IEEE Member                       $300        $120        $225
Non-Member                        $375        $140        $275


REGISTER EARLY 












Tuesday, October 2 

10:30-12:00 

FDDI 

Chair: Floyd Ross, Unisys 


15th Conference on Local Computer Networks 
Preliminary Program 


10:30-12:00 Integrated Services Networks 
Chair: Jim Mollenauer, Artel 
"A Method for Dynamic Bandwidth Allocation in the 
FDDI-II MAN", S. Casale et al.-Univ. di Catania
"Service Integration in FDDI", P. Martini et al.- 
Aachen Univ. of Technology 

"An Integrated Corporate Backbone Providing Packet and 
Fixed Bandwidth Services", J. Mollenauer-Artel 
"Evaluation of a 4 Mbps Twisted Pair Transmission 
System for an IVDLAN", Kazawa et al.-Hitachi 


10:30-12:00 Gigabit Networks 

Chair: Bob Grow, XLNT Designs Inc. 

"FDDI Follow-On Status", R. Grow, XDI 
"Configuration Control for Bus Networks", P. Heinzman 
et al.- IBM

"Register Insertion/Self Token Protocol for High Speed Ring
LANs", K. Tanno et al.-Yamagata Univ.

"Optical Switching and Routing Architectures for Fiber Optic 
Networks", A.Choudhary et al.-Syracuse Univ. 


1:30-3:00 
Gigabit Networks 
Chair: Bob Grow, XLNT 
Designs Inc. 


1:30-3:00 Networking Experience 
Chair: Harvey Freeman, LANWORKS, Inc. 

"LAN Traffic Analysis and Workstation Characterization", 
K. Khalil et al.- Bellcore 

"Experience with the XTP", R. Simoncic et al-Univ. of VA 
"Practical XNS Networking at the BLS", W. Adams- 
User Technology Assoc. 

"Experiences in Local Networking at USF", 

R. Sankar et al.-Univ. of So. FL 


10:30-12:00 Network Performance 

Chair: Marjory Johnson, RIACS 
"Performance Evaluation of FDDI and Interconnected 
Heterogeneous Networks", A. Nilsson et al.- NC State 
"LAN with Collision Avoidance: Switch Implementation and 
Simulation", T. Suda et al - UCI 
"Transient Performance of a CSMA System Under 
Temporary Overload", R. Hardy et al.- Simon Fraser Univ.
"End-to-End Echo Response Time Analysis on Large 
Mainframe UNIX Systems", H. Mutlu et al.-AT&T 


3:30-5:00 What the Future Holds for the LCN
Chair: Carl Cargill, DEC

3:30-5:00 Internetworking
Chair: Bill Seifert, Wellfleet
"Performance of Dual Backbone Rings in Interconnected
Token Ring Networks", Doug Ha, Virginia Poly. Univ.
"Route Determination in Multiple Ring Networks",
R. Cohen et al - Technion HT
"Internetworking Dissimilar LANs with FDDI",
G. Schatzberg-RAD Network Devices
"High Performance Internetworking Protocol", M. Ziterbart
et al.-Univ. of Karlsruhe

Wednesday, October 3

8:30-10:00 Internetworking
Chair: Bill Seifert, Wellfleet

8:30-10:00 ATM
Chair: Bob Klessig, ATM
"ATM Adaptation Layer Protocols and IEEE LAN
Interconnection", G. Stassinopoulos et al.-Univ. of Athens
"High-Speed Multimedia Backbone LAN Architecture Based
on ATM Technology", Y. Takiyasu-Hitachi, et al.
"Variable Bit Rate Video Transmission in the Broadband
ISDN Environment", Y. Zhang-CONTEL, et al.


3:30-5:00 LAN Technology I 
Chair: Pat Gonia, Honeywell 

"Implementation of a Secure Gateway on Hughes Aircraft's 
Engineering Design Network", P. Ho-Hughes Aircraft 
"IR Wireless System for ARCNet LAN", M. Betancor- 
Universad de Las Palmas, et al. 

"Experimental Analysis of Cache Memories for Interconnect 
Controllers”, T. Sheu et al.- IBM 
"Management of High Speed Networks with SNMP", J. 
Case, Univ. of TN and SNMP Research 

8:30-10:00 High Performance Protocols 

Chair: Dick Watson, LLNL 

"A High-Performance Implementation of OSI TP-4; 
Evaluation and Perspectives", C. Diot et al.-LGI 
"An Efficient Implementation of a High-Speed Protocol 
Without Data Copying", X. Zhang et al.-Univ. of 
Technology, Sydney 

"UltraNet: An Architecture for Gigabit Networking", 

B. Beach, Ultra 


10:30-12:00 Protocol Efficiency and Performance
Chair: Alf Weaver, Univ. of VA

10:30-12:00 FDDI
Chair: Floyd Ross, Unisys
"FDDI-The Interoperability Challenge", P. Hayden- DEC
"Using Redundancy in FDDI Networks", K. Ocheltree- IBM
"FDDI II Implementation Issues", B. Duckall- Apple
"Workstations Catch the FDDI Ring", W. Walther- Unisys
"FOCON 34", P. Jathe-Deutsche System Technik

10:30-12:00 Military/Govt. Networks
Chair: Steve Andersen, Unisys
"SAFENET- A Navy Approach to Computer Networking",
J. Paige-U.S. Navy
"The Real-World of Networking, or 'Don't Forget the User
Interfaces'", G. Macheel et al., Rockwell


1:30-3:00 

LCN and Standards- 
An Uneasy Future 

Chair: Carl Cargill, DEC


3:30-5:00 LAN Technology II
Chair: Ron Rutledge, Martin Marietta Energy Systems
"Hidden Transmission: A Method to Improve Token Ring
Performance", J. Zhu et al- Lehigh Univ.
"Dynamic Channel Assignment in a Hybrid Indoor Radio/Wire
Network", H. Shigeno et al.- Keio U.
"Telefax in LANs", B. Heinrichs et al-Aachen Univ. of Techn.
"Performance Evaluation of FDDI", P. Amer et al.- U. of DE

3:30-5:00 FDDI Implementations
Chair: Doug Hunt, Prime Computer
"FDDI Concentrator Design Issues", E. Hotard-
Martin Marietta
"FDDI Over Unshielded Twisted Pairs", S. Ginzburg
et al.-DEC
"An FDDI Bridge for the Super Backbone LAN",
O. Takada et al - Hitachi Ltd.
"Frame Content Independent Stripping for Token Rings",
K. Ramakrishnan et al., DEC


1:30-3:00 Protocol Efficiency and 
Performance 

Chair: A. Weaver, Univ. of VA 

"Performance of the XTP in an Ethernet Environment", 

J. Chen et al.- Concordia Univ. 

"A High-Performance OSI Implementation on FDDI", 

P. Gonia et al.- Honeywell 

"Reliable Transfer of Data in a LAN with Multicast 
Distribution", A. Narayan, Planning Research Corp. 
"DCP: A Fully Distributed MAC Protocol Exploiting the 
Capabilities of Polling Systems", M. Conti et al., 

CNR Instituto CNUCE 


1:30-3:00 Distributed/Real-Time Networks 

Chair: Kami Prasad, Univ. of Lowell 

"Ada-Based Real-Time Network Environment”, P. Wang 

et al.-ESL 

"An Evaluation of IEEE 802 Protocols and FDDI in Real- 
Time Distributed Systems", A. Ghasem et al.-Univ. of WY 
"Unisys Data Transfer System", S. Andersen-Unisys 
"The Multidriver: A Reliable Multicast Service Using the 
XTP", B. Dempsey et al.-Univ. of VA


3:30-5:00 Metropolitan Area Networks 

Chair: Jim Mollenauer, Artel 

"Cycle Compensation Protocol: A Completely Fair Protocol 
for the Unidirectional Twin Bus Architecture", D. Du et al.- 
Univ. of MN 

"A Fair-Featured MAN", S. Palazzo et al.-Univ. of Catania 
"An Evolutionary Approach to the Design of a MAN", 

R. Ramaswamy-UMKC, et al. 






STANDARDS 


Editor: Fletcher J. Buckley, 103 Wexford Dr., Cherry Hill, NJ 08003, phone (609) 866-6350, fax (609) 866-7753, Compmail+, f.buckley 


IEEE Project 802 standards efforts 

Ronald W. Gibson, Boeing Computer Services 


Having reached its 10th anniversary 
this year, IEEE Project 802 for Local 
Area Networks standards has achieved 
a high level of maturity along with a 
correspondingly high level of complex¬ 
ity. In this article, I discuss the back¬ 
ground of the 802 Project; the national 
and international environment within 
which the project operates for the devel¬ 
opment of standards; standards the IEEE 
has developed or is currently working 
on; and the expectations for Project 802 
in the future. 

Introduction and background. Proj¬ 
ect 802 has become a multifaceted or¬ 
ganization developing standards for 
LANs. To understand the possible future 
direction of the project committee, you 
must first understand how the project got 
to where it is today and the driving 
forces for change behind it. 

History of Project 802. The effort for 
developing LAN standards started in 


August 1979 under the auspices of the 
IEEE Computer Society’s Computer 
Standards Committee as part of its 
Microprocessor Standards Working 
Group when the committee submitted a 
project authorization request. 

After the work of the committee 
members became known, they were 
asked to reposition themselves under the 
IEEE Computer Society as a formal, in¬ 
dependent standards body. Tentative ap¬ 
proval of their work and the original 
PAR soon followed. 

On February 29, 1980, the first meet¬ 
ing of what was then known as the 
“IEEE Computer Society Local Network 
Standards Committee” was held, with 
about 80 persons in attendance. The at¬ 
tendees wanted to develop one IEEE 
standard for the interconnection of com¬ 
puters, terminals, printers, and file serv¬ 
ers in a local area. 

This LAN standard would then be sub¬ 
mitted through Accredited Standards 
Committee X3 Subcommittee S3, which 


operated as the US Technical Advisory 
Group to the International Organization 
for Standardization TC97/SC6 group for 
adoption of work as an ISO standard. 

The original objective of the IEEE 802 
committee was to develop a draft stan¬ 
dard for LANs in one year and have a 
formal standard in two. 

The committee then developed a func¬ 
tional requirements document that de¬ 
fined the scope, speed, and area coverage 
of LANs. The first definition of scope 
was for data rates of 1, 5, 10, and 20 
megabits per second, a channel access 
method similar to the Ethernet technique, 
topology of bus or tree networks (not 
rings), service categories for data trans¬ 
actions, frame structures, and address 
techniques. 

Original structure of effort. The goal 
of the committee’s effort was to use the 
lower layers of the ISO reference model 
for Open Systems Interconnection as a 
guide. The members then organized into 
three subgroups: 

(1) the Media Group — the physical 
level, 

(2) the Data Link Control/Medium 
Access Group, and 

(3) the Higher Layer Interfaces 
Group. 

The chair of each group was a member
of the Project 802 Executive Committee.
Liaison members were designated to coordinate
the effort with other standards
bodies such as those within the ASC, the
International Electrotechnical Commission,
the Electronic Industries Association,
and the ISO. The Project 802 committee
set about developing the one envisioned
standard after defining project
operating rules for voting rights and procedures;
implementing individual memberships,
as opposed to corporate memberships
(to ensure high participation of
users, not just vendors); and otherwise
assuring fairness of operation.

Further developments. In December of 
1980, the need for two media access 


[Figure 1 diagram. Labels recoverable from the scan: 802.1 Higher
layer interfaces; 802.2 Logical link control; 802.10 Secure data
exchange (SILS); medium access and physical layers for 802.3
(CSMA/CD), 802.4, 802.5 (Token ring), 802.6 (Metropolitan network),
and 802.9 (Integrated).]

Figure 1. Relationship among IEEE 802 standards.



















































methods became apparent, and the token 
access procedure was included among 
the two options. The European Computer 
Manufacturers Association TC24 com¬ 
mittee was established about the same 
time, and its members agreed to use the 
Project 802 committee’s work as the ba¬ 
sis for its work. A formal liaison was es¬ 
tablished that still exists today. 

By the next December, three access 
methods were in effect: carrier sense 
multiple access with collision detection, 
token bus, and token ring. The work of 
the Project 802 unit was then split into 
three standards efforts instead of one. 

This development was widely reported 
in the media as being the result of a com¬ 
mittee that “couldn’t make up its mind.” 
In reality, the 802 committee recognized 
that, just as different vehicles exist for 
transporting people and products, differ¬ 
ent media access methods were needed 
for different LAN applications. 

Patent issues were becoming apparent 
and contact was made with organizations 
and individuals regarding the use of pat¬ 
ents on a nondiscriminatory, minimal-fee 
basis. Several meetings and discussions 
were conducted on the patent of Olof 
Soderblom of Sweden, who had devel¬ 
oped a token ring technology for several 
banks using the token ring method he 
had patented. 

Some effort was spent deciding what 
legal and financial effect this would have 
on both the IEEE and Project 802. The 
work was coordinated with Soderblom’s 
firm, Willemijn Houdstermaatschappij 
BV, so that nonexclusive licenses for the 
patent would be available to everyone 
under reasonable and nondiscriminatory 
terms. The IEEE eventually decided that 
the companies implementing the token 
ring access method would have to work 
directly with Soderblom and his firm to 
determine the license and fee structure 
for any product developed using token 
rings. 

In August 1982, Project 802 was reorganized
into a structure that divided the
Media Group into working groups 802.3
(CSMA/CD), 802.4 (Token Bus), and
802.5 (Token Ring). The Higher Layer
Interfaces group was designated 802.1,
and the Data Link Control/Medium Access
(DLMAC) group was reorganized
with the medium access control portions
assigned to WGs 802.3, 802.4, and
802.5.

The balance of the DLMAC group’s 
activity was given to a new group en¬ 
titled Logical Link Control and desig¬ 
nated 802.2. Later, the Metropolitan 
Area Network WG (802.6) was added, 
increasing the area of LAN coverage. 

Two TAGs were formed for broadband 
and fiber optic technologies, not to de¬ 
velop standards but to develop recom¬ 
mended practices or guidelines, serve as 


Table 1. Higher Layer Interfaces WG (802.1) efforts under way.

Task/Effort                   PAR       WG           TCCC      IEEE Std.  ISO
                              Approved  Approval     Approval  Approval   Status

Overview and Architecture     Yes       Yes          Yes       Pending
LAN/MAN Management
  (1) Management              Yes       In process
  (2) Load protocol           Yes       Yes          Yes       Pending
  (3) Layer management
      guidelines              Yes       In process
FDDI Bridging                 Yes       In process
MAC Bridges                   Yes       Yes          Yes       Pending    DIS
Remote Bridging               Yes       In process
Glossary                      No        In process


Related acronyms

ANSI      American National Standards Institute
ASC       Accredited Standards Committee
CCITT     International Consultative Committee for Telephone and Telegraph
COS       Corporation for Open Systems
CSMA/CD   Carrier sense multiple access with collision detection
DIS       Draft international standard
DLMAC     Data link control/medium access
DQDB      Distributed queue, dual bus
ECMA      European Computer Manufacturers Association
EIA       Electronic Industries Association
FDDI      Fiber distributed data interface
IEC       International Electrotechnical Commission
IRL       Inter-repeater link
ISDN      Integrated Services Digital Network
ISO       International Organization for Standardization
IVD       Integrated voice/data
JTC       Joint technical committee
LAN       Local area network
LLC       Logical link control
LSAP      Link service access point
MAC       Media access control
MAN       Metropolitan area network
NWI       New work item
OSI       Open systems interconnection
PAR       Project authorization request
PDAD      Proposed draft addendum
Phy       Physical
SC        Subcommittee
SDE       Secure data exchange
SILS      Standard for Interoperable LAN Security
SONET     Synchronous optical network
Std.      Standard
TAG       Technical advisory group
TCCC      Technical Committee on Computer Communications
WG        Working group











liaisons with other standards bodies, and 
assist the other WGs in the technical as¬ 
pects of the two media technologies. 

Still later, additions were made for the 
Integrated Voice/Data WG and the LAN 
Security WG (see Figure 1). 

National and international operat¬ 
ing framework. The Project 802 WGs 
for LANs are part of the IEEE Computer 
Society. Their work includes developing 
LAN standards for layers one and two of 
the ISO Reference Model for OSI. In the 
international standards arena, the focal 
point for LAN standards is ISO/IEC 
Joint Technical Committee No. 1, where 
the majority of the LAN work is carried 
out in subcommittee six. 

The members of the JTC1 are national
bodies; for the US, that body is the
American National Standards Institute.
To provide technical input from the US,
ANSI has established technical advisory
groups for each subcommittee (and, in
many cases, for the WGs as well). The
US TAG for SC6 is ASC X3 subcommittee
S3. It's important to note that
concurrent efforts are going on in the
international realm, with a great deal of
coordination and feedback between the
Project 802 committee and other standards
bodies.

The Project 802 unit has informational
and liaison representation with
many other standards bodies. Among
these are the EIA, the IEC, the ECMA,
the International Consultative Committee
for Telephone and Telegraph, and
various X3 committees. In addition, liaison
has been set up with user and special-interest
organizations such as the
National Institute of Standards and Technology,
the Corporation for Open Systems,
the OSI/Network Management Forum,
and the OSI Implementor’s Workshop.

The idea is to ensure that the knowledge
gained in developing standards or
in the implementation of standards is
shared to further enhance each
organization’s work.

Process for standards development.
The process for developing a standard in
the IEEE 802 arena starts long before it
is submitted to the X3S3 committee. In
the particular WG, working papers or
proposals are submitted by individuals
for consideration. These individuals are
usually sponsored by both US and international
companies, but represent an individual
vote, not a company vote.

Once a technology is deemed worthy
of development for LANs, a PAR is submitted
to the IEEE Standards Board for
approval so further work can be
launched. After discussion and WG letter
ballots, a draft proposed standard,

Table 3. CSMA/CD Working Group (802.3) efforts under way.

Task/Effort                   PAR       WG           TCCC         IEEE Std.  ISO
                              Approved  Approval     Approval     Approval   Status

1Base5                        Yes       Yes          Yes          Yes        DP
10Base2                       Yes       Yes          Yes          Yes        IS*
10Base5                       Yes       Yes          Yes          Yes        IS*
10BaseT                       Yes       Yes          In progress
10BaseF (Fiber)               Yes       In progress
10Broad36                     Yes       Yes          Yes          Yes        DIS
Sublayer Management           Yes       Yes          Yes          Pending    NWI
Conformance Testing           Yes       Yes          In progress
802.3 Maintenance             Yes       In progress
Fiber Optic LAN System        Yes       Mid 1990
System Topology               Yes       Draft mid 1990
Hub Management                Yes       Draft reviewed, July 1990
Fiber Optic IRL               Yes       Yes          Yes          Yes        DP

* International Standard 8802-3 (1989)






Table 2. LLC Working Group (802.2) efforts under way.

Task/Effort                   PAR       WG           TCCC      IEEE Std.  ISO
                              Approved  Approval     Approval  Approval   Status

LLC Type 1                    Yes       Yes          Yes       Yes        Std.
LLC Type 2                    Yes       Yes          Yes       Yes        Std.
LLC Type 3                    Yes       Yes          Yes       Yes        PDAD
Conformance Testing           Yes       In progress
Sublayer Management           Yes       Yes
Security Label Option         No        In study
802.2 Maintenance             Yes       In progress



















Table 4. Token Bus WG (802.4) efforts under way.

Task/Effort                   PAR       WG           TCCC      IEEE Std.  ISO
                              Approved  Approval     Approval  Approval   Status

Token Bus                     Yes       Yes          Yes       Yes        DIS
8802-4 Update*                Yes       Yes          Yes       Yes
Conformance Testing           Yes       In process
Redundant Media               Yes       In process
Through-the-Air Media         Yes       Draft in preparation

* Carrier band, MAC layer management, access control machine, MAC-Phy
interface, and fiber optic bus medium.


Table 5. Token Ring WG (802.5) efforts under way.

Task/Effort                   PAR       WG           TCCC        IEEE Std.  ISO
                              Approved  Approval     Approval    Approval   Status

Station Management
  Revision                    Yes       Yes          Yes         Std.       PDAD-3
Voice Grade Media
  (telephone twisted pair)    Yes       Yes          Yes         Pending
Reconfiguring Dual
  Ring Station                Yes       Yes          In process
Multiring
  (Source Routing)            Yes       Yes          Yes         Pending    PDAD-4
Station Management            Yes       Yes          Yes         Std.
16M bit-per-second
  Operation Support           Yes       Yes          Yes         Std.       PDAD-1
Early Token Release           Yes       Yes          Yes         Std.
Fiber Optic Station
  Attachment                  Yes       In process
16 Mbps Unshielded
  Twisted-Pair Wire           Mid 1990  Draft in preparation




guideline, or recommended practice is 
submitted to the 802 Executive Commit¬ 
tee for approval. 

Approval entails authority to submit 
the draft document for a 30-day Techni¬ 
cal Committee on Computer Communica¬ 
tions letter ballot. The TCCC letter ballot 
constitutes one more cycle in the effort to 
make sure a document gets wide review 
and comment before it goes to the IEEE 
Standards Board and into the JTC1 arena. 

These TCCC ballots must get a 75 per¬ 
cent return and a 75 percent approval rat¬ 
ing from the respondents before the re¬ 
sulting document can be sent to the IEEE 
Standards Board for approval. 
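The two TCCC thresholds can be written as a small check. This is only an illustrative sketch: the function name and arguments are hypothetical, and it assumes that "75 percent" means at least 75 percent, with the approval rate computed over returned ballots, as the phrase "from the respondents" suggests.

```python
def tccc_ballot_passes(ballots_sent: int, ballots_returned: int,
                       approvals: int, threshold: float = 0.75) -> bool:
    """Return True if a letter ballot meets both 75 percent tests.

    Hypothetical helper. Assumes the return rate is taken over ballots
    sent and the approval rate over ballots returned.
    """
    if ballots_sent <= 0:
        raise ValueError("ballots_sent must be positive")
    if not 0 <= approvals <= ballots_returned <= ballots_sent:
        raise ValueError("need 0 <= approvals <= returned <= sent")
    if ballots_returned == 0:
        return False  # no respondents, so neither test can be met
    return_rate = ballots_returned / ballots_sent
    approval_rate = approvals / ballots_returned
    return return_rate >= threshold and approval_rate >= threshold


if __name__ == "__main__":
    # 80 of 100 ballots returned, 60 of those approve: both rates pass.
    print(tccc_ballot_passes(100, 80, 60))
    # Only 70 of 100 returned: fails on return rate despite full approval.
    print(tccc_ballot_passes(100, 70, 70))
```

Note that a document can fail on the return rate alone, even when every respondent approves.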

Standards efforts under develop¬ 
ment. The IEEE 802 standards efforts 
under development in the eight WGs and 
the two TAGs are shown in Tables 1-10. 

Project 802.1 is also the body the IEEE 
Standards Office designates to evaluate 
requests for link service access point ad¬ 
dresses. As such, it has a set of criteria 
you must meet to have an LSAP address 
assigned by the IEEE Standards Office 
for public protocols. 

The original 802.3 WG effort (see 
Table 3) was on the coaxial-based 10 
megabit-per-second operation. The origi¬ 
nal document has been published as the 
basic IEEE and ISO standard and covers 
both 200- and 500-meter segments. The 
designation of the subcommittees is first, 
data rate (1 or 10 Mbps); second, Base¬ 
band or Broadband; and third, distance 
coverage (for example, 10Base5 = 10 
megabits Baseband covering 500 meters). 
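The naming rule is regular enough to decode mechanically. The sketch below is an illustration only, not part of any 802.3 document: the function and type names are hypothetical, the multiplier of 100 meters per digit is inferred from the 10Base5 example above (the figures are nominal), and the handling of letter suffixes such as the "T" in 10BaseT, which name a medium rather than a distance, is our assumption.

```python
import re
from typing import NamedTuple, Optional


class Designation(NamedTuple):
    rate_mbps: int                  # first field: data rate in Mbps
    signaling: str                  # second field: "baseband" or "broadband"
    segment_meters: Optional[int]   # third field, when numeric (nominal)
    medium: Optional[str]           # third field, when a letter code


_PATTERN = re.compile(r"^(\d+)(Base|Broad)(\d+|[A-Za-z]+)$", re.IGNORECASE)


def parse_designation(name: str) -> Designation:
    """Decode a designation such as '10Base5' (hypothetical helper)."""
    match = _PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized designation: {name!r}")
    rate, signaling, tail = match.groups()
    kind = "baseband" if signaling.lower() == "base" else "broadband"
    if tail.isdigit():
        # Assumption: the number counts hundreds of meters.
        return Designation(int(rate), kind, int(tail) * 100, None)
    # Letter suffixes (10BaseT, 10BaseF) name a medium, not a distance.
    return Designation(int(rate), kind, None, tail.upper())


if __name__ == "__main__":
    for name in ("10Base5", "10Base2", "10Broad36", "1Base5", "10BaseT"):
        print(name, parse_designation(name))
```

Keeping distance and medium in separate fields avoids pretending that a letter suffix encodes a segment length.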

The subcommittees have also published
an IEEE standard as a supplement
to the basic standard. The supplement
covers the repeater, fiber optic inter-repeater
link, 10Broad36, and 1Base5 operation.
They are also involved in 802.3
layer management, conformance testing,
and maintenance of the original standard
for revisions and corrections as necessary.

The 802.10 model in Table 8 shows 
the interrelation of secure data exchange, 
key management, and security manage¬ 
ment. The SDE protocol is being defined 
to provide the security services of confi¬ 
dentiality and connectionless integrity 
with the services to support the manage¬ 
ment of data origin authentication and 
access control. 

The SDE protocol will be placed be¬ 
low the LLC sublayer. The key manage¬ 
ment protocol will provide for the auto¬ 
mated distribution of the cryptographic 
keys to be used by the SDE. The security 
management aspect describes how the 
objects can be managed securely. 

The 802.7 TAG (see Table 9) works
with 802.3, 802.4, video, and point-to-point
applications of broadband technology.
It also designates specifications for
receivers, transmitters, and transmitter
characteristics.

The 802.8 TAG (Table 10) has a duplex
connector recommendation on the
multimode optical duplex connector as
defined by IEC SC86B USA 7 and ISO
9314 Part 3 for keying details. Where a













Table 6. Metropolitan Area Network WG (802.6) efforts under way.

Task/Effort          PAR        WG           TCCC          IEEE Std.   ISO
                     Approved   Approval     Approval      Approval    Status

DQDB Std.            Yes        Yes          In progress
Multiport Bridging   Yes        In process




Table 7. IVD LAN Interface WG (802.9) efforts under way.

Task/Effort          PAR        WG            TCCC        IEEE Std.   ISO
                     Approved   Approval      Approval    Approval    Status

IVD Standard         Yes        In progress
Signaling Requests   N/A        In progress

Other issues pending:

(1) Decision on line codes for higher speeds.
(2) Q93X signaling in IVD LANs.
(3) The MAC to layer management entity interface.
(4) LAN ISDN interworking.


Table 8. Standard for Interoperable LAN Security (802.10 - SILS) efforts under way.

Task/Effort            PAR        WG           TCCC        IEEE Std.   ISO
                       Approved   Approval     Approval    Approval    Status

Security Model         Yes        In process
Secure Data Exchange   Yes        Mar. 1990
Key Management         Yes        In process
Security Management    Yes        In process




Table 9. Broadband TAG (802.7) efforts under way.

Task/Effort             PAR        WG          TCCC        IEEE Std.   ISO
                        Approved   Approval    Approval    Approval    Status

Recommended Practices   Yes        Yes         Yes         Yes
Status Monitoring       No





single mode optical duplex connector is 
required, the WG recommends adopting 
the fiber distributed data interface single 
mode duplex connector. These are for 
802 defined optical interfaces. 

Expectations for the future. On the
whole, the IEEE Project 802 can probably
look forward to an increasing level
of sophistication. As the application of 
the LAN technologies matures, expect 
more study groups and task forces to be 
established for new areas of interest. 

This will almost certainly involve 
some reshuffling of the market for the 
LAN standards being developed and re¬ 
fined by the Project 802 committee. I 
foresee several developments among the 
possible changes. 

The 1Base5 standard was developed
for low-speed applications on twisted-pair
wires. Newer technologies such as
the 10BaseT standard offering higher
speeds on twisted-pair wires provide 
significant new capability. Expansion of 
the CSMA/CD standard for fiber optic 
cables will give long life to the CSMA/ 
CD technique. 

The 802.4 standard will be extended. 
This standard is well developed and is 
being expanded for other media, trans¬ 
mission technologies, and further ro¬ 
bustness. The 802.4-related confor¬ 
mance testing documents will be fully 
developed to ensure that products from 
the different vendors can be interoper¬ 
able. The use of fiber optics and/or 
“through the air” as a medium will cer¬ 
tainly extend the usefulness of this tech¬ 
nology. 

Revision is certain for the 802.5 stan¬ 
dard for alternate media such as fiber 
optic cable and unshielded twisted-pair 
wires. One change already implemented 
involves the early token release capabil¬ 
ity to enhance the efficiency of the to¬ 
ken ring media access method at higher 
speeds. The additions for redundancy 
and automatic reconfiguration in case of 
breaks in the ring will make this tech¬ 
nology more robust for applications 
other than just those for the office. 

The source routing protocol for bridg¬ 
ing token rings will also permit ease of 
expansion and accessibility on the net¬ 
work. The 802.5-related conformance 
testing documents will be fully devel¬ 
oped to ensure that products from differ¬ 
ent vendors can be interoperable. 

I also expect wide acceptance and re¬ 
finement of the market for 802.6. Since 
the synchronous optical network stan¬ 
dards are nearly complete, and the Inte¬ 
grated Services Digital Network is be¬ 
coming more mature and prevalent, the 
need for 802.6-based MANs will become
more important. 802.6 forms the
basis of the Bellcore switched multimegabit
data service metropolitan network
service.

We should also see wide acceptance of 
the IVD draft standard that is being de¬ 
veloped. This WG is developing a stan¬ 
dard that will be compatible with ISDN 
technologies. A draft standard is ex¬ 
pected by the end of 1990. With the im¬ 
plementation of the ISDN in many com¬ 
panies, an IVD LAN standard will be¬ 
come even more important. 

I also expect completion of the 802.10
secure data exchange protocol, refinement
of the model, and a draft of the key
management protocol. With the increas¬ 
ing need for secure data being transmit¬ 
ted across LANs for both the commercial 
and government sectors, this WG will 
certainly become more important. 

To be sure, Project 802 has come a 
long way from its beginnings 10 years 
ago. As the technology matures and bet¬ 
ter means to transmit with higher speeds 
over greater distances are devised, more 
standards or revisions to existing stan¬ 
dards will be developed. Project 802 is 
not at an end but is still evolving. We 
can expect the IEEE 802 LAN Standards 
Committee to be with us for some time 
to come. 


Table 10. Fiber Optic TAG (802.8) efforts under way.

Task/Effort                  PAR        WG                      TCCC        IEEE Std.   ISO
                             Approved   Approval                Approval    Approval    Status

62.5-micrometer Reference
Fiber Recommendation         Yes        In progress (Draft B)
Polymer Optical Fibers       No         In study




About the author 

Ronald W. Gibson has been with Boeing since 1965 and has been involved in system
engineering on a number of computer systems for configurations, installations, and
communications techniques for the past 14 years. His Technology Survey of Local Area
Networks (1981), written as part of a communication-needs survey for the company, is
among several articles related to computer system installation and networking
techniques he has published. He has been a member of the Executive Committee of the
IEEE 802 LAN Standards Committee for the past five years and is a member of the
working group developing LAN security standards.


CALL FOR PAPERS

The First International Workshop on Interoperability 
in Multidatabase Systems will be held in Kyoto (the 
beautiful old capital city of Japan) along with IEEE 
Conf. on Data Engineering. Original papers are soli¬ 
cited on topics including, but not limited to: 

Autonomy Resolving semantic heterogeneity 

Interdependence Federated & Multi database tools 
Data Definition New paradigms, object orientation 
Transactions Queries & updates, spec. & proc. 

Interoperability of DBMS & KBMS 
CHAIRPERSONS: Ahmed K. Elmagarmid, C.S. Dept., Purdue
U., West Lafayette, IN 47907 USA, (317) 494-1998 &
Yutaka Matsushita, Dept. Instrumentation, Keio U., Hiyoshi, 
Yokohama, Japan, 81-44-63-1141 ex3564 
SUBMIT 7 copies of an extended abstract of at most 5 
pages (troff/TeX accepted) to Marek Rusinkiewicz, Houston
U., C.S. Dept., Houston, TX 77204-3475, USA, (713) 749-4791,
email: marek@cs.uh.edu or Yahiko Kambayashi,
Kyushu U., CS-CE Dept., Hakozaki, Fukuoka 812, Japan, 
email: yahiko@csce.kyushu-u.ac.jp, fax: 81-92-641-1825. 
PROGRAM COMMITTEE: Y. Breitbart (USA), B. Czejdo (USA),
T. Furukawa (Japan), H. Garcia-Molina (USA), G. Gottlob (Germany),
S. Heller (USA), S. Hikita (Japan), T. Kato (Japan), L.
Kershberg (USA), W. Litwin (France, EUROPEAN COORDINATOR),
Y. Masunaga (Japan), E. Neuhold (Germany), P. Scheuermann
(USA), G. Schlageter (Germany), A. Sheth (USA), H. Tirri
(Finland), S. Uemura (Japan), R. Wang (USA). LOCAL
ARRANGEMENTS: M. Yoshikawa, M. El-Sharkawi, M. Nagata, X.
Zhong; PUBLICATION: A. Sheth; FINANCE: R. Martin;
REGISTRATION: A. Datta; PUBLICITY: W. Perrizo.




Submission deadline: September 15, 1990 

Revised Ext. Abstract: January 15, 1991 

Acceptance notices: December 10, 1990

Workshop dates: April 8-9, 1991


REPAIR PC's IN LESS TIME 
AT LESS COST, WITH 

LOGIMER 


If your business is maintenance and repair of 
personal computers you now can save time and money 
with the LOGIMER PC Analyzer. 



FEATURES: 

° More than 1000 Tests Within One Minute 
° Locates up to 70% of Real Breakdowns on Motherboard 
° Pin-points Location of Defective IC's 
° Especially useful in the case of a Computer's Screen Blackout
° 1/2 Size Card fits into any Slot in PC

ORDER YOUR LOGIMER NOW! 

Visa and Mastercard Accepted 


TOTAL POWER INTERNATIONAL, INC. 

418 Bridge Street, Lowell, MA 01850 Tel. (508) 453-7272 

Fax: (508) 453-7395 


Reader Service Number 7 























10th IEEE SYMPOSIUM on COMPUTER ARITHMETIC



ARITH 10 

JUNE 26-28, 1991 
Grenoble, 
France 


Sponsored by the Technical Committee on VLSI (TC-VLSI), IEEE Computer Society 
in cooperation with the 

Centre National de la Recherche Scientifique (CNRS) 
and 

Institut d'Informatique et de Mathématiques Appliquées de Grenoble (IMAG)


CALL FOR PAPERS 


Authors are invited to submit papers describing recent advances on all aspects of computer 
arithmetic, including, but not restricted to the following topics: 

• Foundations of number systems and arithmetic. 

• Arithmetic algorithms and their analysis. 

• Processor design and implementation. 

• Highly parallel arithmetic units and systems. 

• New floating point chips, boards and systems. 

• Standards for number representation and arithmetic. 

• Impact of high level languages on arithmetic systems. 

• Special function implementation. 

• Reliability and testability. 

Four (4) copies of the complete paper should be submitted to one of the two Co-Program Chairmen 
no later than November 19, 1990. Authors will be notified of acceptance in February 1991, and 
final camera ready papers (up to 8 pages) will be due in March 1991. Conference proceedings 
will be available at the symposium. 


Co-Program Chairman 

(for the Americas and the Pacific) 

David W. Matula 

Dept. of Computer Science and Eng.
Southern Methodist University 
Dallas, TX 75275 
USA 

Ph. (214) 692-3080 
Fax (214) 692-4138 
Email: matula@csvax.seas.smu.edu 


Co-Program Chairman 
(for Europe, Africa and Asia) 

Peter Kornerup

Dept. of Math. and Computer Science
Odense University
DK-5230 Odense
Denmark

Ph. +45 66 15 86 00
Fax +45 65 93 26 91
Email: kornerup@imada.dk


IEEE COMPUTER SOCIETY 


THE INSTITUTE OF ELECTRICAL 
AND ELECTRONICS ENGINEERS, INC. 












CS NEWS


Editor: Guylaine M. Pollock, Sandia National Laboratories, Division 1424, PO Box 5800, Albuquerque, NM 87185; phone (505) 846-0040; Internet, gmpollo@sandia.gov 


Computer Society recognizes special achievements and service

Numerous awards have been presented to IEEE Computer Society members to recognize special achievements and dedicated service in a wide range of areas.

Prathima Agrawal received the society’s Distinguished Service Award for “outstanding service to the IEEE International Conference on Computer Design as program and general chair in 1987 and 1988, respectively.”

An Outstanding Contribution Award was presented to Carl G. Davis for “establishing, organizing, and chairing the first Software for Strategic Systems Conference for the Computer Society’s Huntsville, Alabama, chapter.”

Meritorious Service Awards were presented to

• Lorraine M. Duvall, for “technical and administrative leadership while serving as chair of the Technical Committee on Software Engineering.”

• Karen Friedman, for “years of dedication and support of the mass storage systems symposia and producing the Digest of Papers.”

• Anthony F. Hutchings, for “excellent support to the IEEE International Conference on Computer Design by chairing its CAD track.”

• Walter H. Kohler, for “technical and administrative leadership while serving as chair of the Technical Committee on Distributed Processing.”

• Carl Landwehr, for “technical and administrative leadership while serving as chair of the Technical Committee on Security and Privacy.”

• Vladimir Nejezchleb, for “years of dedication as treasurer and member of the Technical Committee on Mass Storage Systems’ Executive Committee.”

• Kathy Sills, for “years of dedication and excellence in supporting seven mass storage systems symposia.”

• Richard C. Smith, for “dedication and contributions to the ACM/IEEE Design Automation Conference over the past eight years and for serving as general chair in 1990.”

• Dennis A. Yinger, for “dedication and continuous contributions to the ACM/IEEE Design Automation Conference as arrangements and exhibits chair over the past six years.”

Certificates of Appreciation were awarded to

• Gordon Adshead and Mariagiovanna Sammi, for “contributions as members of the editorial board of IEEE Design & Test magazine.”

• Terry Bowen, R. Allen Cobb, Ronald W. Gibson, Kimberly Kirpatrick, James Mollenauer, John E. Montague, and Everett Rigsbee III, for “contributions to the development of the IEEE 802 Standards.”

• Robert Cannon, O.E. Katter, Susan Merritt, and William Mitchell, for “significant service to the profession in the CSAB accreditation process.”

• Richard S. Frary, for “participating in the programs of the Committee on Public Policy and being the Computer Society’s 1988-89 representative to the AFIPS Government Relations Committee.”

• Manfred Hein and Sharon Trauth, for “contributions leading to the successful revision of ANSI/IEEE Standard 730.1-1989, IEEE Standard for Software Quality Assurance Plans.”

• Belinda Housewright, for “logistical support of the IEEE 9th and 10th mass storage systems symposia.”

• Timothy J. Kriewall, for “producing an outstanding program as general chair of the 1989 IEEE Symposium on Computer-Based Medical Systems.”

• Gerald Peterson, for “significant service to the profession in the ABET accreditation process.”

• Margaret Peterson, for “distinguished service as editor of the CompMed TC Newsletter.”

• Nathan Tobol, for “years of dedicated service to and guidance of the IEEE 802 Standards.”


Publications, workshops, roundtable among
activities of VLSI TC

Donald W. Bouldin, Past Chair, Technical Committee on VLSI

The IEEE Computer Society’s Technical Committee on Very Large Scale Integration has several ongoing activities designed to benefit professionals with common interests. The TC focuses on the interaction between the semiconductor process and system design on VLSI. Emphasis is on integrating the design, fabrication, application, and business aspects of VLSI from both hardware and software viewpoints.

Special issues of IEEE journal. Microelectronic systems must be specified at every level of abstraction, including system, behavioral, register, logic, circuit, transistor, and process levels. To address this growing area, a series of special issues is planned for the IEEE Journal of Solid-State Circuits as a forum for original papers emphasizing the interaction of various aspects of microelectronic systems, including system design, logic circuit design, memory design, architecture, CAD tools, testing, physical design, VLSI technology, and semiconductor processes.

The deadline for papers is September 15, 1990. Prospective authors should check Call for Papers in this issue of Computer for information on submissions.

VLSI technical bulletin. The TC’s quarterly newsletter, sent free to all TC members, contains refereed technical articles as well as announcements about conferences and workshops of interest to the VLSI community. The editor-in-chief is Amar Mukherjee of the University of Central Florida, and the managing editor is Sunil Das of the University of Ottawa.

Annual workshop. The TC sponsors an annual workshop to bring together leading individuals with a variety of interests. Topics at the 1990 workshop included wafer-scale integration, design with programmable technologies, asynchronous design for digital systems, design for testability, CAD frameworks, synthesis and verification, and interactive manufacturing techniques.

The 1991 IEEE Computer Society 
Workshop on VLSI will be held February 
10-13 in Orlando, Florida. Paul Cohen of 
the Massachusetts Microelectronics 
Center will be the general chair. To par¬ 
ticipate in the technical program, contact 
Len Berman, Program Chair, IBM Watson
Research Center, PO Box 218, Yorktown
Heights, NY 10598, phone (914)
945-1213, fax (914) 945-2141; e-mail, 
berman@ibm.com. 

Model VLSI curricula. The IEEE 
Computer Society and the ACM have de¬ 
veloped model curricula for several aca¬ 
demic programs and are about to update 
their recommendations. A curriculum 
committee within the TC on VLSI will 
collect information from universities 
teaching VLSI-related courses and dis¬ 
tribute this information to other educa¬ 
tors. 

Also, a report containing recom¬ 
mended or model VLSI courses is being 
written to guide other universities in set¬ 
ting up similar courses. For additional in¬ 
formation, contact Donald Bouldin, 
Electrical and Computer Engineering, 
University of Tennessee, Knoxville, TN 
37996-2100, phone (615) 974-5444, fax 
(615) 974-5492; e-mail, 
bouldin@sunl.engr.utk.edu. 

Roundtable on VLSI education. Courses
in VLSI cover many topics but
generally traverse multiple levels of ab¬ 
straction. In project-oriented courses, 
students experience the complete design 
cycle: design, simulation, implementa¬ 
tion, and test. But are these courses meet¬ 
ing the short- and long-term needs of in¬ 
dustry? 

To answer this question, the TC organ¬ 
ized a roundtable discussion held July 21, 
1989, in cooperation with the second an¬ 
nual VLSI Education Conference and 
Exhibition. The participants, represent¬ 
ing industry and academia, included Hal 
Carter of the University of Cincinnati and 
the National Science Foundation, Paul 
Cohen of the Massachusetts Microelec¬ 
tronics Center and the University of Low¬ 
ell, Randy Geiger of Texas A&M Univer¬ 
sity, Nancy Klimavicz of IBM, William 
McAllister of Hewlett-Packard, Jim 
Rowson of VLSI Technology, and 
Donald Bouldin of the University of Ten¬ 
nessee. An edited transcript of the 
roundtable appears in the June 1990 issue 
of IEEE Design & Test. 


Nominations sought in society’s 
awards program 


The IEEE Computer Society is seeking 
nominations to recognize service contri¬ 
butions. The key society service awards 
are the 

• Richard E. Merwin Distinguished 
Service Award — A certificate and 
$1,000 for outstanding service to the pro¬ 
fession at large, including significant ser¬ 
vice to the Computer Society or its prede¬ 
cessor organizations. This is the highest 
level service award the society bestows. 

• Harry Hayman Distinguished Ser¬ 
vice Award — A certificate and $2,000 
for long and distinguished service of an 
exemplary nature in the performance of 
duties over and above those called for as a 
regular employee of the society. This is 
the highest service award given to an ac¬ 
tive society staff member. 

• Distinguished Service Certificate — 
A certificate for long and distinguished 
service to the society at a level of dedica¬ 
tion and achievement rarely demon¬ 
strated. 

• Outstanding Contribution Certifi¬ 
cate — A certificate for an achievement 
of major value and significance to the so¬ 
ciety. The achievement should be a spe¬ 
cific, concisely characterized accom¬ 
plishment, as opposed to a collection of 
different efforts. There is no minimum
service time; at least two seconding
nominations are required.

• Meritorious Service Certificate — A 
certificate for meritorious and signifi¬ 
cant service to any Computer Society- 
sponsored activity. Qualification is en¬ 
hanced by the level and number of contri¬ 
butions, excellence, dedication, and ten¬ 
ure of service. Four years’ service and at 
least one second are required. 

• Certificate of Appreciation — A cer¬ 
tificate for creditable service to any Com¬ 
puter Society-sponsored activity. Gener¬ 
ally requires one year of service and no 
second. 

The top three, the distinguished ser¬ 
vice awards, require four years of service 
and at least three seconds. 

This year the Board of Governors ap¬ 
proved two new awards: the Young Com¬ 
puter Scientist and Engineer Award and a 
Doctoral Dissertation Award. 

Those wishing to make nominations 
should submit the nominee’s name, af¬ 
filiation, address, and phone number, 
along with the particular award recom¬ 
mended and a suggested five- to 20-word 
citation, by September 15, 1990, to Jo¬ 
seph E. Urban, Arizona State University, 
College of Engineering and Applied Sci¬ 
ences, Dept, of Computer Science and 
Engineering, Tempe, AZ 85287-5406; 
e-mail: jurban@asuvax.eas.asu.edu. 


Perot, Tinker honored by Computerworld/ 
Smithsonian Awards 


H. Ross Perot and Robert Tinker 
headed the list of awardees at the 1990 
Computerworld/Smithsonian Awards
banquet in Washington, DC, June 25. The 
program, supported by some 40 corpo¬ 
rate sponsors, including the IEEE Com¬ 
puter Society, honored 11 recipients out 
of 219 nominees. Nine industry awards 
were made along with the special bene¬ 
factor awards to Perot and Tinker that rec¬ 
ognize career-long achievement rather 
than an individual project. 

According to sponsors, the program, 
subtitled “A Search for New Heroes,” 
was created to “search out and publicly 
honor those men and women who are us¬ 
ing information technology, across a 
spectrum of industries, to make our 
planet a more human, healthy, and coop¬ 
erative place to live.” 


Organized by Computerworld and the 
Smithsonian Institution National Mu¬ 
seum of American History, the program 
was established in 1989 and presented the 
first awards in 1990. 

Lifetime achievement. The Price Wa¬ 
terhouse Lifetime Achievement Infor¬ 
mation Technology Award was pre¬ 
sented to Perot for his role as a “ground¬ 
breaking businessman in the information 
technology industry.” In the early 1960s 
Perot observed that many companies had 
computing needs that were not being met 
because they couldn’t afford their own 
mainframes, while other companies had 
mainframes that were not being used to 
capacity. A pioneer in computer 
timesharing, Perot started EDS to provide
full-scale computer services to large
companies or government agencies. Today
EDS is the largest computer service
firm in the world.

Advancement of science. The second 
benefactor award, the Siemens Award for 
the Advancement of Science, was presented
to Tinker, chief scientific officer
of the Technical Education Research Centers. His
award recognizes a life devoted to “the 
belief that for science education to thrive, 
it must empower students and teachers to 
solve meaningful, real-world problems.” 
This conviction led him to develop the 
Technical Education Research Center, 
with programs such as the National Geo¬ 
graphic Kids Network, the Star Schools 
program, Labnet, and the Global Lab. 
Tinker’s commitment to mobilize the lat¬ 
est technology and apply it to the class¬ 
room has shown that collaborative efforts 
between schools, government, and busi¬ 
ness can result in improved and engaging 
mathematics and science education. 

Industry awards. The nine industry 
awards, by category, were 

• Business and related services — 
Berkeley Systems, for the Outspoken 
program, a simple system for navigating 
through software structures, enabling the 
blind to use the next generation of com¬ 
puters, including graphical user inter¬ 
faces. 

• Education and academia — the Jason 
Foundation for Education, for the Jason 
Project, which uses multiple technolo¬ 
gies to extend live video images of scien¬ 
tific and historical projects into class¬ 
rooms and museums. 

• Environment, energy, and agricul¬ 
ture — Environmental Systems Research 
Institute, for development of ARC/Info, 
an advanced geographic information pro¬ 
gram that makes it possible to combine 
and manipulate geographical data from 
maps and numerical information. 

• Finance, insurance, and real estate — 
Swiss Options and Financial Futures Ex¬ 
change, for development and implemen¬ 
tation of the world’s first totally elec¬ 
tronic exchange and clearinghouse di¬ 
rectly linked to its members. 

• Government and nonprofit organiza¬ 
tions — Ministry of the Interior of the 
Government of Thailand, for the world’s 
only fully integrated population demo¬ 
graphics system. 

• Manufacturing — the Lubrizol 
Corp., for successfully implementing an 
advanced technical system that uses arti¬ 
ficial intelligence to generate and distrib¬ 
ute to employees and customers material 
safety data sheets containing essential 
health and safety information. 

• Media, arts, and entertainment — 
Personics Corp., for a new music vending
machine called Music Maker that permits
users to make individualized tapes while 
guaranteeing royalties to musicians and 
reducing recording industry losses from 
unauthorized copying. 

• Medicine — Purdue University, for 
using high-powered supercomputer cal¬ 
culations to successfully solve the three- 
dimensional structure of animal viruses, 
making possible new progress in the de¬ 
velopment of vaccines for diseases as 
varied as the common cold and AIDS. 

• Transportation — Federal Express, 
for development and implementation of a 
comprehensive system using all types of
information technologies to track shipments
of every item throughout the delivery
cycle, from the hands of the originator
to final destination.

Computer Society represented. Representing
the Computer Society at the
awards presentation, held in the
nineteenth-century Pension Building, were
Oscar N. Garcia, past president; James H. 
Aylor, current vice president for press ac¬ 
tivities; and Barry Johnson, vice presi¬ 
dent for membership and information ac¬ 
tivities. T. Michael Elliott, executive di¬ 
rector, and Vi Doan, director of board and 
administrative services, also attended. 



Exciting challenges...ground-floor opportunity...state-of-the-art 
technology in an all-new, totally advanced manufacturing facility. 
Just some of the advantages this major name in innovative 
computer products can offer a take-charge entrepreneur like you. 
Based at our prime Southeast location, with its attractive cosmo¬ 
politan lifestyle, you’ll lead an autonomous engineering team 
working with dynamic advances in UNIX*, PC technology, Advanced
Human Interface, AI. You will direct the development of
new products, formulate strategies, research the latest computer 
breakthroughs, and select/train a staff of professional managers. 
You must be a forceful, results-driven leader with 10 years in a 
state-of-the-art development environment. An MSEE/CS degree is 
required (MBA preferred) combined with proven successful R&D 
project management for the commercial marketplace. 

In addition to the prestige of working with an industry leader, you'll 
receive an excellent salary and exceptional benefits package— 
with all the advancement potential you can ask for! If you’d like to 
be on the cutting edge of some of today’s most advanced systems 
and technology, send your resume with salary history to: 

P.O. Box 3014, IEEE Computer Society 
Attn: Marian Tibayan, Dept. 485 
Los Alamitos, CA 90720-1264 

We are an equal opportunity employer. 

*UNIX is a registered trademark of AT&T








NEW PRODUCTS 


Contact or send releases to Nancy Hays, Computer, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720; Compmail+, n.hays 


Signal Analytics brings color image processing to Mac II 


Signal Analytics has announced 24-bit 
color image processing software for sci¬ 
entific users of the Apple Macintosh II 
computer line. The program, IPLab/ 
Spectrum, is based on IPLab, the 
company’s scientific gray-scale image 
processing software (see pp. 103-107 in 
this issue for a review of IPLab). 

IPLab/Spectrum displays and pro¬ 
cesses images in full color with up to 8 
bits allocated to each of the red, green, 
and blue components. It also supports the 
use of several other color coordinate sys¬ 
tems, including YIQ, HSV, and CMYK. 


Users can decompose a full-color im¬ 
age into one of these color component 
sets for processing, then merge the im¬ 
ages to recreate a color image. Accord¬ 
ing to the company, IPLab/Spectrum 
provides such fundamental image pro¬ 
cessing operations as point functions, 
fast linear filtering, morphology, fast 
Fourier and cosine transforms, automatic 
and manual measurements, and binary
operations (adding, subtracting, dividing,
and multiplying two images). All
these operations can be applied to
user-defined regions of interest, and statistical
analyses can be obtained on arbitrary
polygonal regions.
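
The decompose, process, and merge workflow described above can be illustrated in a few lines. This is a sketch only and says nothing about how IPLab/Spectrum itself is implemented: it uses Python's standard colorsys module, the helper names are hypothetical, and the "image" is just a short list of RGB pixels with components in the range 0 to 1.

```python
import colorsys


def to_planes(pixels, convert):
    """Split (r, g, b) pixels into three component planes via `convert`."""
    converted = [convert(r, g, b) for r, g, b in pixels]
    return [list(plane) for plane in zip(*converted)]


def from_planes(planes, convert):
    """Merge three component planes back into pixels via `convert`."""
    return [convert(a, b, c) for a, b, c in zip(*planes)]


if __name__ == "__main__":
    pixels = [(1.0, 0.0, 0.0), (0.2, 0.4, 0.6)]
    # Decompose into hue, saturation, and value planes.
    hue, sat, val = to_planes(pixels, colorsys.rgb_to_hsv)
    # Process a single plane: brighten the value plane, clamped to 1.0.
    val = [min(1.0, v * 1.2) for v in val]
    # Merge the planes to recreate a color image.
    print(from_planes([hue, sat, val], colorsys.hsv_to_rgb))
```

The same two helpers work for the YIQ coordinate system by passing colorsys.rgb_to_yiq and colorsys.yiq_to_rgb instead.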

IPLab/Spectrum can read data in a va¬ 
riety of image formats. Image files can 
be read and written in the standard PICT 
and TIFF formats. The program can also 
import most noncompressed image files 
from other computers with color planes 
interleaved or sequential. 

The software costs $749. Owners of 
IPLab can upgrade to IPLab/Spectrum 
for $275. 

Reader Service 30 


Wavetracer focuses on study of physical phenomena 


Wavetracer is introducing a set of 
software and hardware tools to produce 
a problem-solving environment aimed 
at helping scientists and engineers study 
complex physical phenomena. The firm 
calls this tool set the industry’s first 
true 3D, data-parallel computing envi¬ 
ronment. 

The products introduced include the 
Data Transport Computer, the Multi C 
programming language, and a set of pre¬ 
programmed physical phenomena solu¬ 
tion tools. 

The Data Transport Computer is a 
three-dimensional, massively parallel 
computer designed for highly efficient 
data transporting, processing, and visu¬ 
alization for multidimensional problems 
and other computationally intensive ap¬ 
plications. It attaches to most industry- 
standard workstations. Pricing begins at 
approximately $98,000. 

The Multi C programming language 
was created for developing multidimen¬ 
sional, data-parallel software applica¬ 
tions. It has reportedly been designed to 
simplify development of solutions to us¬ 
ers’ problems rather than to fit software 
to a computer architecture. According to 
the company, Multi C interfaces with 
other programming languages to simplify 
migration of application programs writ¬ 
ten in C and Fortran, for example. 

The preprogrammed physical phenom¬ 
ena solution tools provide mathematical 


solvers that reportedly eliminate user 
programming of complex, multidimen¬ 
sional algorithms. The first of these 
tools, EM Wavetracer, can be used to 
solve electromagnetic scattering prob¬ 
lems in antenna design, design of low- 


observable vehicles, materials, radar de¬ 
sign, and sensor design. 

Data Transport: Reader Service 31 
Multi C: Reader Service 32 
Tools: Reader Service 33 



Wavetracer’s Data Transport Computer is part of the company’s environment 
for solving complex, multidimensional problems in physical phenomena, mathe¬ 
matics, and visualization. 
























Voice Com Systems targets 
large sales forces 

Voice Com Systems has released two 
interactive voice response products aimed 
at companies with large, geographically 
dispersed sales forces. Sales Activity 
Management (SAM) and Order-By- 
Phone are designed to help companies 
reduce the cost of sales administration, 
order entry, and fulfillment by providing 
quick and easy access to information 
from the field. 

With SAM, sales people can report 
sales call activity via 800 numbers from 
any touch-tone phone, 24 hours a day, 
using the telephone keypad to identify 
themselves, enter sales call dates and lo¬ 
cations, product sold, and shelf position 
and sales display status. 

Order-By-Phone allows callers to order 
products over the phone using the key¬ 
pad. It targets companies with field sales 
people and distributors who do frequent, 
high-volume ordering and have short 
sales cycle and fulfillment requirements. 

SAM: Reader Service 34 
Order-By-Phone: Reader Service 35 


Apple launches Personal 
LaserWriter line 

Two new printers make up Apple 
Computer’s Personal LaserWriter line, 
which combines a Motorola 68000 pro¬ 
cessor and the Canon LBP-LX laser en¬ 
gine to produce 300-dpi text, graphics, 
and scanned images. 

The Personal LaserWriter SC is a 
single-user printer for basic text and 
graphics. It includes 1 Mbyte of RAM, 
the amount required for imaging a full 
page of text and graphics. Times, Helvet¬ 
ica, Courier, and Symbol typefaces in 
9- through 24-point sizes are included on 
floppy disks. A Small Computer System 
Interface (SCSI) provides the flexibility 
for daisy-chaining up to six additional 
peripheral devices. 
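
As a rough check on the 1-Mbyte figure, the sketch below estimates the size of an uncompressed, one-bit-per-dot bitmap for a full page at 300 dpi. The function name `page_bitmap_bytes`, the US letter page size, and the assumption that every dot on the sheet is stored are illustrative choices, not details from Apple's announcement.

```python
def page_bitmap_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Bytes needed to hold one page as an uncompressed raster bitmap."""
    width_px = round(width_in * dpi)    # dots across the page
    height_px = round(height_in * dpi)  # dots down the page
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8        # round up to whole bytes


if __name__ == "__main__":
    # Assumed page: US letter, 8.5 x 11 inches, 300 dpi, 1 bit per dot.
    size = page_bitmap_bytes(8.5, 11, 300)
    print(f"{size:,} bytes = {size / 2**20:.2f} Mbytes")
```

Under these assumptions a letter-size page works out to roughly 1 Mbyte, in line with the memory quoted for the SC. A real printer images only the printable area, so the actual requirement would likely be somewhat lower.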

The Personal LaserWriter NT is a multiuser printer for individuals and
small work groups with more advanced graphics needs. It includes Appletalk,
the built-in network capability, and the Adobe Postscript page description
language. An RS-232 serial interface provides connectivity for non-Apple
computers. The printer has 2 Mbytes of RAM, upgradable to 8 Mbytes, and
includes 35 standard typefaces.

Apple’s Personal LaserWriter features a desktop design, two paper input
trays, and compatibility with Macintosh applications.

Pricing is $1,999 for the Personal Las¬ 
erWriter SC and $3,299 for the Personal 
LaserWriter NT. The SC model can be 
upgraded to NT capability by replacing 
the controller board. 

Reader Service 36 


Tektronix workstation handles visualization applications

Tektronix says it designed its XD88/35 visualization workstation to provide
the overall system performance needed for high-end visualization
applications. Computing performance is 21 MIPS and 2.5 Mflops. The
workstation includes Core Tekimaging, the company’s 2D image processing
tool kit; and the 4G graphics accelerator, which reputedly redraws more
than 1 million 2D and 3D vectors per second.

Tekimaging is an image processing software library for the XD88’s Motorola
88000 RISC-based processors. It allows users to interact with real-world
data sets from such sources as satellites and seismic sensors, and to
display that abstract data as clear and understandable pictures.

Core Tekimaging comes standard on the firm’s workstations to provide 2D
image processing subroutines for image enhancement and analysis. Advanced
Tekimaging allows customers to upgrade from 2D image processing to 3D image
database reconstruction and perspective scene generation.

With the introduction of the XD88/35, the company announced that all its
workstations will include image processing capability as a standard
feature.

The XD88/35 has a base price of $31,995. A disk/tape configuration is
available for an additional $5,000. Advanced Tekimaging is an option for
$14,000.

XD88/35: Reader Service 37
Tekimaging option: Reader Service 38


Alias offers industrial design in two price ranges

Alias Research has released Alias Designer and Alias Studio, providing
designers a choice of tools at two price levels.

Alias Designer is a fully integrated 3D computer-aided industrial design
system tailored for smaller design departments and independent design
consultants in need of a modeling and visualization package. It enables
designers to create new product designs and product styling and to evaluate
design and packaging ideas. Once a product design is chosen, 3D data
created in Alias Designer can be transferred to CAD systems for engineering
and manufacture.

Alias Designer operates on the Personal Iris workstation family from
Silicon Graphics.

Alias Studio is an interactive modeling system based on nonuniform rational
B-splines (NURBS), the standard geometry for CAD/CAM systems. Alias’ NURBS
geometry enables transparent data transfer to CAD/CAM systems through
standard file formats such as IGES, VDA, DES, and DXF. The software
includes Quickwire, a viewing mode for fast, interactive manipulation and
viewing of very large CAD files, such as those typical in architectural
visualization.

Alias Studio operates on the Iris family of workstations and the RS/6000
System series of workstations from IBM.

Prices for Alias Designer start at $12,000. Alias Studio prices begin at
$25,000.

Alias Designer: Reader Service 39
Alias Studio: Reader Service 40










ADVERTISER INDEX

Academic Press, Inc.                                      127
CACI Products Company                                     1
Carnegie Mellon University                                41
Codex Corp.
COMPSAC ’91                                               Cover IV
Computer Architecture Symposium                           Cover III
Computer Arithmetic Symposium                             90
Computer Workstations Conference                          101
DASFAA '91
Formal Methods in VLSI Design Workshop                    58
Griffin Software Systems Technologies, Inc.               56
IEEE Computer Society Membership                          123
IEEE Computer Society Ombudsman                           41
IMS’91
Intelligent Control Symposium                             24-25
Lawrence Livermore National Laboratory                    72
Local Computer Networks Conference                        82-83
Motorola Inc.
Multiple-Valued Logic Symposium                           109
Oakland Group, Inc.                                       42
Software Engineering Standards Application Workshop       81
StarSys, Inc.                                             7
StructSoft, Inc.                                          65
Sun Microsystems                                          4-5
Supercomputing Conference                                 111
Systems Integration Conference                            102
TASC
Telesoft                                                  Cover II
Total Power International, Inc.                           89
Visualization '90                                         99


FOR DISPLAY ADVERTISING INFORMATION, CONTACT:

Northern California and Pacific Northwest: John D. Vance & Associates, Inc.,
4030 Moorpark Ave., Suite 116, San Jose, CA 95117; (408) 741-0354.

Southern California and Mountain States: Richard C. Faust Co., 24050 Madison
St., Suite 100, Torrance, CA 90505; (213) 373-9604.

Midwest: The Kingwill Company, 4433 W. Touhy Ave., #540, Lincolnwood, IL
60646; (708) 675-5755.

East Coast: Atlantic Representative Group, 349 Maple Pl., Keyport, NJ 07735;
(201) 739-1444.

New England: Arpin Associates, P.O. Box 6444, Holliston, MA 01746;
(508) 429-8907.

Europe: Heinz J. Gorgens, Parkstrasse 8a, D-4054 Nettetal 1 - Hinsbeck (FRG);
phone: (0 21 53) 8 99 88; telex 841 (17)2153310=HJG tlx d.

Southwest/Southeast: Heidi Rex, 10662 Los Vaqueros Cir., PO Box 3014, Los
Alamitos, CA 90720-1264; (714) 821-8380.

For production information, conference, and classified advertising, contact Heidi Rex
or Marian Tibayan.

COMPUTER, 10662 Los Vaqueros Cir., PO Box 3014, Los Alamitos, CA
90720-1264; phone (714) 821-8380; fax (714) 821-4010.


PRODUCT INDEX

Company                             RS#        PG#

Academic Press, Inc.                8          127
ACC Microelectronics                120
Accutek Microcircuit                121
Alias Research                      39-40      95
Apple Computer                      36, 42     95, 97
C-Cube Microsystems                 135        100
CACI Products Company               -
Catalyst Semiconductor              122        98
Cirrus Logic                        123
Data Translation                    136        100
Dell Computer                       137        100
Extrema Systems                     138        100
Fortron/Source                      139        100
FPS Computing                       140        100
Griffin Software Sys. Tech. Inc.    5          56
GTFS                                22         105
IBM Corp.                           41         97
Information Machines                141        100
Levco                               142        100
Micro Networks                      124        98
National Semiconductor              125        98
National Technical Info. Service    23         106
Newtech                             126        98
Oakland Group, Inc.                 4          42
Precision Monolithics               127-128    98
Radstone Technology                 143        100
RISC International                  129
Siemens                             130
Signal Analytics                    21, 30     94, 105
Sony Microsystems                   144        100
StarSys, Inc.                       3          7
StructSoft, Inc.                    6          65
Sun Microsystems                    2          4-5
Tau Corp.                           43         97
Tektronix                           37-38      95
Telesoft                            1          Cover II
Texas Instruments                   145        100
Total Power International, Inc.     7          89
Veridata Research                   146        100
Vicom Systems                       44         97
Voice Com Systems                   34-35
Wavetracer                          31-33
White Technology                    131        98




Editorial comments

I liked:

I disliked:

I would like to see:

For reader-service inquiries, see other side.

PLACE POSTAGE HERE

COMPUTER
Reader Service Inquiries
PO Box 16508
North Hollywood, CA 91615-6508
USA

PO Box is for reader-service cards only.



IBM introduces computers for home use

The PS/1 family from IBM comes in four models targeted specifically for
home users. The units operate on Intel 80286 microprocessors and offer
either 512 Kbytes or 1 Mbyte of system memory. Users can choose a
1.44-Mbyte floppy disk drive or a combination of the floppy disk drive and
a 30-Mbyte hard disk drive. A built-in 2,400-bps modem, a two-button mouse,
and a 101-key keyboard are also included. A color monitor is available.

The computers come equipped with the DOS operating system and feature
several on-line services for an introductory trial period. These include
Prodigy, PS/1 User’s Club for on-line support, and Promenade, an education
and entertainment service (available this fall). Microsoft Works is also
included.

Reportedly designed with the novice in mind, the PS/1 guides users to the
built-in software and on-line services with its first screen, which is
divided into quadrants labeled “Information,” “Microsoft Works,” “Your
Software,” and “IBM DOS.”

Prices range from $999 to $1,999, depending on the memory, storage, and
display selected.

Reader Service 41


Apple demonstrates support for ISDN

With the announcement of the ISDN Developer Toolkit, Apple Computer is
offering developers the basic connectivity software and hardware necessary
for accessing an Integrated Services Digital Network and its services from
a Macintosh. Apple claims developers can use the toolkit to create ISDN
applications for the Macintosh that maximize the integrated voice, data,
and imaging capabilities that ISDN offers through a single high-speed
transport system. The company expects a variety of applications and
services to be developed, including imaging and document database query,
screen-based telephony, and resource sharing.

The toolkit consists of an Apple ISDN NB card, ISDN software, and an
integrated voice/data manager. The IVD manager includes both the ISDN IVD
tool, which controls telephone functions, and the ISDN serial connection
tool, which controls data functions.

Currently, the ISDN Developer Toolkit is compatible with two ISDN central
office switches: AT&T’s 5ESS (generic 5E4.2 or later) and Northern
Telecom’s DMS-100 (generic BCS-27 or later).

Apple plans to support ISDN hardware and software in use in France as well
as exploring support for various ISDN switches in North America, Europe,
and Asia.

Reader Service 42


Eagle software tracks moving objects

Eagle is digital image acquisition, processing, and display software
developed by Tau Corporation and hosted on the Sun Sparcserver. Described
more formally by the company as a data reduction video tracker for image
sequence analysis, Eagle makes it possible to visually analyze such dynamic
activities as traffic flow, destructive testing, particle movement, and
motion studies.

Functional Eagle attributes include automatic and real-time tracking of
multiple objects or multiple points per object, and analysis algorithms
that automatically adapt to background conditions, reportedly allowing
Eagle to locate and track leading edges and centroids even under difficult
noise and clutter conditions.

Eagle acquires image sequences by digitizing video or cine film sequences
in volatile and nonvolatile memory. It displays and digitally enhances
individual frames, processes each frame to automatically extract the
location of multiple features, and tracks these locations frame to frame.

The software can be interfaced to any workstation, terminal, or PC that
supports the X Window server graphic-terminal commands and protocol.

Pricing is $54,000 for the software only, $69,000 for the software and
integration with the customer’s hardware, and $129,000-plus for a custom
turnkey system.

Reader Service 43


Vicom’s Master performs data visualization

According to Vicom Systems, the Master data visualization server balances
the elements needed in the fusion of image processing, graphics, and
displays. Data display is reputedly uncoupled from memory and data
constraints, providing complete programmability of format and data type
independent from the 896 Mbytes of dynamic RAM image memory.

The display system offers full-color, real-time display on three
1,600 x 1,280 screens as well as an X Window capability with separate
lookup tables for each window. A dedicated graphics engine provides
simultaneous, independent graphics processing. The 160-Mbyte-per-second,
64-bit internal bus allows concurrent operation of the four on-board
processors coupled to a VMEbus. One of the processors is a floating-point
engine delivering 33 Mflops.

The development environment contains a suite of C libraries, editors,
debuggers, compilers, and linkers. The system functions as a network node
with industry-standard X.11 Windows, NFS, TCP/IP, and rlogin protocols
supported over Ethernet, FDDI, and HPPI local area networks.

Reader Service 44

Vicom Systems’ Master data visualization server can be configured as a
board set or a deskside tower, or it can be integrated into a Sparcstation
370.










IC Announcements

Company, Model, Function (Reader Service number)
Comments

ACC Microelectronics, ACC-2036, AT controller (Reader Service 120)
A single-chip controller for notebook-sized 80286 and 80386SX AT-compatible
computers. Has seven DMA channels, three timer/counter channels, 14
external interrupt channels, data buffers, and address buffers. Features
simultaneous EMS and shadow RAM. Supports clock rates up to 25 MHz. Comes
in a 208-pin package. Cost: $100.

Accutek Microcircuit, Forty Plus Memory, DRAM module (Reader Service 121)
A 1-Mbyte by 8-bit CMOS DRAM module for the Apple Macintosh IIfx. Features
data-in, data-out, and write-enable lines to separate pins on separate data
buses. Operates in read-write and read-modify-write modes. Also available
in a 4-Mbyte version. Comes in a 64-pin JEDEC-standard configuration.
Cost: $89.

Catalyst Semiconductor, CAT28F010, Flash memory (Reader Service 122)
A 128-Kbyte by 8-bit CMOS flash erasable and electrically reprogrammable
memory with access times of 120, 150, and 200 ns. Features “speed
programming,” which allows 4 bytes of the same data to be programmed
simultaneously; TTL-compatible I/O; and 10-year data retention. Comes in a
32-pin DIP or PLCC. Cost (1,000s): $23 (200-ns, plastic DIP).

Cirrus Logic, CL-SH360, Disk controller (Reader Service 123)
An integrated PC XT or AT disk controller that implements on-the-fly
Reed-Solomon code error correction. Reputedly allows a doubling of
hard-disk-drive capacity. Comes in a 100-lead quad flat pack. Production
scheduled for the end of 1990. Cost (100s): less than $24.

Micro Networks, MN6774, Sampling ADC (Reader Service 124)
A sampling A/D converter with a 100-kHz sampling rate and internal T/H amp.
Based on the 12-bit MN774. Available in eight versions (four performance
grades and three levels of reliability screening). Packaged with necessary
glue logic in a 28-pin, side-brazed DIP. Cost (100s): from $105
(commercial), $205 (stress screened), or $235.75 (military).

National Semiconductor, PAL10016C4-2, PAL (Reader Service 125)
A programmable-array-logic device with a max propagation delay time of
2 ns. A combinational device that uses no clocking signals. Accepts 16
input lines and generates four output functions, each with eight product
terms. Comes in a 28-pin PLCC. Now in volume production. Cost (100s):
$60.50 (PAL10016C4-2VC).

Newtech, NT114Lpk, Controller (Reader Service 126)
Integrates an IBM PC XT-, AT-, RT-, or PS-2-compatible keyboard’s
electronics with direct LED drive, LCD control, battery-level test input,
and power management logic. Has sleep mode, 5V operation, wake-up via
keyboard or PC, and 16 optional outputs. Comes in a 64-pin flat pack.
Cost: $4 and up for OEM quantities; from $35 for samples.

Precision Monolithics, PMISpice, Release B, SPICE library (Reader Service 127)
A free library of SPICE macromodels, including instrumentation amp and
low-noise amp SPICE models. Includes OP-177 and OP-77 op amp models and
AMP-01 and AMP-02 models. Contains 54 model net lists for PMI ICs. Comes on
a 5.25-inch disk. Call Sharon Duke at (408) 562-7470 for a complimentary
copy.

Precision Monolithics, OP-160, Op-amp (Reader Service 128)
A high-speed, current-feedback operational amp with a typical slew rate of
more than 1,300 V/µs in unity-gain applications. Has a typical supply
current of 6.5 mA per amp. Comes in an 8-pin ceramic or plastic DIP, with
20-pin LCC devices scheduled. Cost (100s): from $4.50 for extended
industrial grades.

RISC International Systems, Soft Design Toolkit, CAD software (Reader Service 129)
A computer-aided-design software tool for system and application-specific
IC designers. Includes a library of Verilog hardware-description-language
models for use with simulation and synthesis tools. Shipped as source code.
Includes a model validation suite of programs written in Verilog. Models
have Synopsys synthesis scripts. Cost: $65,000 plus $7,800 yearly for
software maintenance.

Siemens, SAB 8352-5, Microcontroller (Reader Service 130)
An 8-bit microcontroller in 12- and 16-MHz versions. Incorporates features
of the Siemens SAB 8051A and SAB 8052A/B chips. Contains 32 Kbytes of
on-chip ROM. Has four 8-bit I/O ports, three 16-bit timer/event counters, a
full-duplex serial channel with flexible baud rate, six interrupt vectors,
a Boolean processor, and a serial interface. Comes in a 40-pin DIP or
44-pin PLCC. Cost (10,000s): $5.50 (PDIP 40); $5.75 (PLCC 44).

White Technology, WS-512K8-120, SRAM (Reader Service 131)
A 4-Mbit (512 Kbits by 8) static RAM. Features 5 V operation. Available
over commercial, industrial, and military temperature ranges. CMOS- and
TTL-compatible I/O. Access times from 45-120 ns. Comes in a 32-pin,
hermetically sealed, ceramic DIP with JEDEC-approved pinouts. Cost (100s):
$600.













Advance Announcement 


Visualization ’90 

October 23-26, 1990 • Le Meridien Hotel • San Francisco, California 

The conference will explore how visualization is being used to extract knowledge from data. The confer¬ 
ence is concerned with all aspects of visualization, with a focus on interdisciplinary techniques. The confer¬ 
ence will allow a dialogue to occur between the developers of visualization methods and visualization users 
across the full spectrum of science, engineering and business. 

Keynote Speaker: Gordon Bell, Stardent Corporation 


Paper Sessions: 

Papers will be presented in the following areas: 

3D Systems and 3D Modeling 
Volume Visualization Algorithms and Techniques 
Visualization of Higher Dimensions 
Scalar and Multivariate Data Visualization 
Data Handling and Visual Representation 
Human Computer Interface with Visualization 
Tools and Techniques for Scientific Visualization 
End-User Data Visualization Systems 
Visualization in Fluid Dynamics 

Applications of Visualization to Scientific, Engineering, 
Biomedical, and Business Problems 

Panels: 

Graphics and Imaging: Trends Toward Unification?
Matt Ward, Worcester Polytechnic Institute

Human Perception and Visualization
Urie Reuter, Bellcore

Multispectral Visualization
Ronald M. Pickett and Haim Levkowitz, Institute for Visualization and
Perception Research, The University of Lowell

How Can Visualization Lead to Breakthroughs in Engineering and Science?
Phil Neray, Alliant Computer Systems

Interaction Issues in Visualization: Requirements, Techniques and Tools
Hikmet Senay, George Washington University

Making a Picture Fit the Eye: Human Engineering in Computer Graphics
Lawrence Stark, University of California at Berkeley

Tools for Visual Data Analysis — User Experiences 
Terry Douglas, Precision Visuals 

Conference Co-Chairs:

Bruce Brown, Oracle
Gary Laguna, Lawrence Livermore National Laboratory

Program Committee Co-Chairs:

Gregory M. Nielson, Arizona State University
Larry Rosenblum, Naval Research Laboratory

Publicity:

Michael M. Danchak, Hartford Graduate Center

Special Advisor:

Kenneth Anderson, Siemens
Immediate Past President, IEEE Computer Society

Case Studies:

Factors Inducing Periodic Breathing in Humans (Syracuse University)
Non-Linear Engineering Analysis (Boeing Computer Services)
Personal Visualization System (Johns Hopkins University/Applied Physics
Laboratory)
Semi-Autonomous Robotic System Visualization (Sandia National Laboratory)
Volume Microscopy (Vital Images)
Real World Applications of Visualization Solutions (Precision Visuals Inc.)
Interdisciplinary Visualization (National Center for Supercomputer
Applications)
The Future Video Telecomputer (Pacific Interface)
Interactive Investigation of Fluid Mechanics Data Sets (Intelligent Light)

Tutorials:

Visual Programming Environments
Ephraim Glinert, RPI

Knowledge Visualization
Aaron Marcus, AM Associates

Computer Vision
Azriel Rosenfeld, University of Maryland

Visualizing Multidimensional Data
William Cleveland, AT&T Bell Laboratories


Sponsored by the IEEE Computer Society, Technical Committee on Computer
Graphics

In cooperation with ACM Siggraph


Please Send More Information on Visualization ’90.

Name: ____
Company: ____
Address: ____
City: ____ State: ____ Zip: ____

Mail to:
Visualization '90, The IEEE Computer Society,
1730 Massachusetts Avenue, N.W., Washington, DC 20036-1903















Microsystem Announcements

Company, Model, Function (Reader Service number)
Comments

C-Cube Microsystems, Compression Master/PC, Image-compression board (Reader Service 135)
An image compression board for IBM PC ATs and compatibles. Based on the
proposed JPEG standard. Consists of an ISA board with C-Cube’s CL550
processor, which uses a symmetrical algorithm to compress or decompress
still images or motion video by a factor of 20. Runs at speeds up to
10 MHz. Supporting software utilities available. Cost: $3,000.

Data Translation, DT2814, Analog input board (Reader Service 136)
A laptop-compatible, half-size, analog input board. Runs with IBM PC XT-
and AT-compatibles. Incorporates a charge pump to generate -12V power from
the 12V provided by the laptop power supply. Can run in a battery-powered
external expansion chassis. Features a 40,000 samples/s throughput rate.
Support software available. Cost: $345.

Dell Computer, System 433E, 486 PC (Reader Service 137)
A 33-MHz PC based on Intel’s 80486 CPU and the 32-bit EISA bus. Has six
EISA and two ISA expansion slots. Compatible with MS-DOS, MS-OS/2, and Dell
Unix System V. Comes with 4 Mbytes of RAM expandable to 16 Mbytes, a 16-bit
VGA controller, a 5.25-inch 1.2-Mbyte or 3.5-inch 1.44-Mbyte floppy disk
drive, a 101-key keyboard, one parallel and two serial ports, and a 230W
power supply. Cost: from $7,899 (with 80-Mbyte hard drive).

Extrema Systems, Octalink, Speech converter (Reader Service 138)
An eight-channel, IBM PC-compatible, single-slot plug-in board that
interfaces with telephone systems to convert speech to digital format for
PC disk storage. Provides software-selectable digitizing rates for each
channel, from 1.2-4 Kbps, and record/playback of speech. Features
jumper-selectable addressing and interrupt control. Supports up to 64
channels. Cost: $1,895.

Fortron/Source, Netset EISA 486, File server (Reader Service 139)
A file server based on Intel’s 25-MHz 80486 CPU. Comes with a 150-Mbyte
hard disk drive, 4 Mbytes of RAM expandable to 32 Mbytes on the
motherboard, and an EISA SCSI hard-disk-drive controller in a 10-bay tower
configuration. Features a proprietary EISA/SCSI host adapter with a 386SX
CPU embedded. Includes AMI BIOS. Base price does not include monitor.
Cost: $9,500.

FPS Computing, Matrix coprocessor, Coprocessor (Reader Service 140)
A matrix coprocessor for the Model 500EA Unix supercomputer. Uses Intel 860
numeric processors in an expandable architecture. Reportedly delivers 480
Mflops with the minimum configuration of six PEs used in parallel. With 84
PEs, hits a peak of 6.7 Gflops. Comes incorporated in the Model 500EA.
Cost: Model 500EA ranges from $820,000 to $4 million.

Information Machines, C*, Controller (Reader Service 141)
A concurrency and communications controller for parallel processing on the
VMEbus. Implements in hardware the synchronization primitives normally
implemented at the operating system and application levels. Works with
different CPUs if the processor boards have a standard VSB interface. Comes
with the Softprobe software development environment. Cost: $4,000.

Levco, 860i, Accelerator card (Reader Service 142)
A 64-bit, single-slot, RISC accelerator card for the Apple Macintosh.
Developed for use as a rendering engine for Pixar’s Mac Renderman. Comes
with 8 Mbytes of RAM expandable to 32 Mbytes, SCSI port, two serial ports,
Intel 860 chip with 64-bit data path, and Transputer link for connection to
the Levco Translink system. Cost: from $5,500.

Radstone Technology, 68-41, SBC (Reader Service 143)
A VMEbus microprocessor board built on Radstone’s Freeflow+ Architecture
and Motorola’s 68040 CPU (when available). Will also include a separate
68020 processor to control the SCSI, Ethernet, and serial on-board I/O
subsystems, freeing the 68040 for uninterrupted data processing. Will
provide multiple local and external buses. Prices not set.

Sony Microsystems, News 3710, Workstation (Reader Service 144)
A Unix workstation based on the 20-MHz R3000 RISC CPU and R3010 coprocessor
from Mips Computer Systems. Provides full support of Kanji ideograms, OSF’s
Motif GUI, and MIT’s X Window System windowing protocols. Configured with a
286- or 640-Mbyte internal drive. Cost: from $6,800 for a diskless system
to $18,200 for a 640-Mbyte system with color monitor.

Texas Instruments, Travelmate 2000, Notebook PC (Reader Service 145)
A 4.4-pound notebook PC based on Intel’s 12-MHz 80C286 CPU. Features a VGA
display, an internal 20-Mbyte hard disk drive, battery, and 101/102-key
keyboard layout with 79 keys, including an embedded numeric keypad and
dedicated cursor control pad. MS-DOS and Laplink reside in ROM and are
resident on the preformatted hard drive. Cost: $3,999.

Veridata Research, Lappower 286/40, Laptop PC (Reader Service 146)
A 286-based laptop operating with a clock speed of 16 MHz, switchable to
8 MHz. Comes with 1 Mbyte of RAM expandable to 8 Mbytes, VGA screen, EMS
support, 128 Kbytes of ROM for BIOS, 256 Kbytes of video RAM, 3.5-inch
1.44-Mbyte floppy disk drive, and 40-Mbyte hard disk drive. Operates with
DOS, OS/2, DR DOS, and Xenix. Cost: $3,150.












CALL FOR PAPERS

Third IEEE Conference On Computer Workstations:
Accomplishments And Challenges

Sponsored by the IEEE Technical Committee on Operating Systems (TCOS)

The Sea Crest Resort, Falmouth, Cape Cod (Massachusetts)

May 15-17, 1991

General chair: Luis Felipe Cabrera, IBM ARC
Local arrangements: Noah Mendelsohn, IBM Cambridge
Publicity chair: Ken Kane, SUN Microsystems
Publications chair: Dorothy Marsh, Cornell University
Hardware exhibits: Pat Mantey, U. C. Santa Cruz

Program co-chairs:
    Ken Birman, Cornell University
    Keith Marzullo, Cornell University

Program committee:
    Anita Borg, DEC WRL
    Thomas Joseph, Olivetti ORC
    Gail Kaiser, Columbia University
    Susan Owicki, DEC SRC
    Mike Powell, SUN Microsystems
    Marc Rozier, Chorus Systems
    M. Satyanarayanan, Carnegie-Mellon University
    Frank Schmuck, IBM ARC
    Henry Sowizral, NASA RIACS
    Doug Terry, Xerox PARC
    Walter Tichy, Universität Karlsruhe
    Robbert Van Renesse, Vrije Universiteit
    Robin Williams, IBM ARC
    Paulo Verissimo, INESC Portugal
    Greg Zack, Xerox DRI

As we enter the 1990s, changes in technology will require rethinking the role of the
workstation in the computing environment. Gigabit communication, desktop parallel
computing, and multimedia applications are now emerging. The key to effective
computing in this new world is the interface between the user and the computing
environment: the workstation. What challenges must be overcome to make effective use
of emerging technologies? CCW '91 seeks to foster dialogue between builders of
workstation-based applications and technological innovators. Papers may focus on
experiences with ambitious applications as well as on research topics. Topics include:

• Design of workstation computing environments 

• Workstation and system architecture 

• Application and system management 

• User interface technologies 

• Exploiting parallelism and massive memory 

• Network support for high performance distributed computing 

• Computer-aided software engineering 

• Information management systems 

• Real-time sensing and control 

• Issues of scale 

• Innovative ideas and technologies 


Papers should be no longer than about 5000 words (20 double-spaced pages), and must
be received by September 15, 1990. Authors will be notified of acceptance by December
1, 1990, and final camera-ready copy is due by January 15, 1991. Both technical and
case-study papers are solicited; case studies should describe existing systems and include
performance or operational data where practical.

The conference will also include a poster session for discussing work in progress. Individ¬ 
uals with a specific interest in participating in the poster session are invited to submit 
a one-page abstract describing their project. In addition, the program committee will 
invite the authors of some of the submitted papers to present their work in the poster 
session. 

Send five copies of each submission to: 

Prof. Keith Marzullo 

Program co-chair, CCW '91

Department of Computer Science, Upson Hall 

Cornell University 

Ithaca NY 14853 

Important dates: 

Submissions due September 15, 1990 

Notification of acceptance December 1, 1990 

Camera-ready copies due January 15, 1991 






Call for Papers 

THE SECOND INTERNATIONAL CONFERENCE ON 
SYSTEMS INTEGRATION 


H IEEE COMPUTER SOCIETY 




fcv THE INSTITUTE OF ELECTRICAL AND 
T ELECTRONICS ENGINEERS, INC. 


Headquarters Plaza Hotel, Morristown, New Jersey 
April 22-25, 1991 

Theme: Managing Large-Scale Integration in the 1990s. 

InLwi?° nfer f^ Ce ,0 . cuses °. n ,he integrals of technologies, processes and systems, and the development of mechanisms and tools 
enabling solutions to complex multi-disciplinary problems. A special emphasis is placed on the management of large-scale integration. 
The conference will provide an international and interdisciplinary forum in which researchers and practitioners can share novel research, 
engineering development, and management experiences. Papers should deal with recent work in theory, design, implementation, utilization 
and experiences of integrated processes and systems. Topics to be addressed include, but are not limited to: 


Process Modeling and Characterization • Re-engineering and Process Simplification • Integration Process in Military, Business and
Industry Applications • Next Generation Computer Aided Environment for Engineering Design, Manufacturing, System Development, etc. •
Role of Human Engineering in Large-scale Integration • Experiences of Large-scale Integration Projects • The Implication of Systems
Integration on Manpower Skills • Quality Control in Large-scale Integration • System Architecture for Integration • Automation of Processes
and Systems

Information and Instructions for Authors: Authors are cordially invited to submit original technical papers to the Program Chairman no
later than September 14, 1990. All papers must be in English, typed in double spaced format, and may not exceed 6,000 words. Each
submission should provide a cover page containing author(s), affiliation(s), complete address(es), identification of principal author, and
telephone number. Also include SIX copies of complete text with a title and abstract. Notice of acceptance will be mailed to the principal
author(s) by December 3, 1990. If accepted, the author(s) will prepare the final manuscript in time for inclusion in the conference
proceedings and will present the paper at the conference; otherwise, the author(s) will incur a page charge. Authors of accepted papers
must sign a copyright release form.


Please send SIX copies of your paper(s) to 

Program Chairperson: 

Dr. Raymond T. Yeh 

c/o Prof. Peter A. Ng 

Dept. of Computer & Information Science

New Jersey Institute of Technology 

University Heights 

Newark, NJ 07102, U.S.A. 


Paper Arrival Deadline: 
September 14, 1990

Acceptance Notification Deadline: 
December 3, 1990 
Final Manuscript Inclusion 
Deadline: January 7, 1991 


For further information contact Peter A. Ng, Department of Computer and Information Science, New Jersey Institute of 
Technology, University Heights, Newark, NJ 07102, U.S.A., (201) 596-3387, ng_p@vienna.njit.edu 


CONFERENCE COMMITTEE

Honorary Conference Chair: Laurence C. Seifert, AT&T
Conference Chair: C. V. Ramamoorthy, UC Berkeley
Program Chair: Raymond T. Yeh, Syscorp Int’l
European Program Co-Chair: Herbert Weber, University of Dortmund
Pacific Program Co-Chair: Fumihiko Kamijo, IPA

No. American Program Co-Chair:
Local Arrangement Chair:
Steering Committee Chair:

Fuad Sobrinho, Software Plant Project
Valdis Berzins, Naval Postgraduate
Roxanne Hiltz, NJIT
Peter A. Ng, NJIT


[...] Systems (INIS) and Department of Computer and Information Science at New [...]

[...] für Mathematik und Datenverarbeitung (GMD), AT&T,










PRODUCT REVIEWS 


Editor: Richard Eckhouse, UMASS-Boston, Harbor Campus, Boston, MA 02125, Compmail+, r.eckhouse; Bitnet, eckhouse@umbsky; CompuServe, 70516,556 

Image processing on the Macintosh 

Robert Morris, University of Massachusetts—Boston 


This review discusses three Macin¬ 
tosh-based image processing packages. 
One is in the public domain and avail¬ 
able at little or no cost. The other two are 
commercial, ranging in list price from 
$499 to $2,990. 

Although cost and capabilities cer¬ 
tainly interrelate here, the relationship is 
by no means simple. Each package will 
find users who can use it but not the oth¬ 
ers, and each will find users satisfied to 
pay the asking price. In the review, I de¬ 
vote the most attention to the midpriced 
package, IPLab. This resulted, in part, 
from my interest in frequency domain 
manipulation, which the other two pack¬ 
ages do not support well. If I were more 
interested in quantitative analysis, my fo¬ 
cus might well shift to the higher priced 
Ultimage package (although the free Im¬ 
age product is very competent also) and 
perhaps unreviewed competitors. 

Following a more extensive review 
of IPLab and rather brief discussions 
of Image and Ultimage, I’ll compare 
some of the important common points, 
including printing, documentation, and 
extensibility. 

IPLab 

IPLab from Signal Analytics is a com¬ 
prehensive image processing package for 
Macintosh computers with 8-bit color 
cards (principally Mac IIs, but you can 
get third-party cards for SE-30s). I used 
the software on a standard Mac II with 
Apple’s 8-bit color card and the Apple 
color monitor, but mainly with the Apple 
monochrome (gray-scale) monitor. The 
system had 5 Mbytes of memory in¬ 
stalled, and I generally ran it under the 
Finder. My personal experience with 
Multifinder leads me to avoid it when¬ 
ever I do critical work. In any case, im¬ 
age processing software is so memory in¬ 
tensive that you will probably want to 
save as much memory as possible by 
running under the Finder. 

IPLab reportedly supports both Post¬ 
script and Quickdraw printers. I tested it 
only with the Postscript-based Laser¬ 
Writer Plus connected with Appletalk. 

IPLab supports the standard range of
things you would want in image processing
software. This includes

• the ability to specify linear filter 
masks up to 5x5 pixels for both finite 
and infinite impulse response filters, 

• binary arithmetic and logical opera¬ 
tions between two images, 

• complex multiplication between two 
images (the real and imaginary parts 
are stored in separate real images; the 
Fourier transform command, for ex¬ 
ample, produces such complex im¬ 
ages), 

• the addition of parameterized linear 
ramps and gaussian or uniform noise, 
and 

• a useful collection of parameterized 
point functions, including linear, 
logarithmic, exponential, trigonomet¬ 
ric, clipping, and a few bit-wise point 
functions. 

Erosion, median, and dilation filters are 
also provided, with masks up to 5x5. 
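
As a rough illustration of what such neighborhood filters compute (this is not IPLab's implementation), the Python sketch below treats erosion, dilation, and median filtering as the minimum, maximum, and median over a square mask. The function names and the border handling (clamping coordinates at the image edge) are assumptions made here for illustration.

```python
from statistics import median


def rank_filter(image, size, pick):
    """Apply `pick` (min, max, or median) over a size x size neighborhood.

    `image` is a list of rows of gray levels. Coordinates that fall outside
    the image are clamped to the border (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    half = size // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)
            ]
            row.append(pick(window))
        result.append(row)
    return result


def erode(image, size=3):
    """Gray-level erosion: each pixel becomes the neighborhood minimum."""
    return rank_filter(image, size, min)


def dilate(image, size=3):
    """Gray-level dilation: each pixel becomes the neighborhood maximum."""
    return rank_filter(image, size, max)


def median_filter(image, size=3):
    """Each pixel becomes the neighborhood median, removing isolated spikes."""
    return rank_filter(image, size, median)
```

Passing `size=5` gives the 5x5 mask limit mentioned in the text.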

With the embedded script language, 
described below, these capabilities pro¬ 
vide a powerful collection of tools. In 
addition, a simple display renormaliza¬ 
tion facility and several color lookup 
table (CLUT) manipulations make it easy 
to do various visual enhancements such 
as false color, often desired for the ex¬ 
amination of medical or satellite images. 
You can also supply specialized CLUTs 
derived from other interactively modified 
Macintosh images, supplied with the 
scripting language (which doesn’t seem 
well suited to this type of task), or gener¬ 
ated by Pascal or C programs through the 
optional interface provided to Apple’s 
Macintosh Programmer’s Workshop 
(MPW) compilers. 
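
A false-color lookup table of the kind described here can be sketched in a few lines of Python. The blue-to-green-to-red ramp below is an arbitrary illustrative palette, and `false_color_clut` and `apply_clut` are hypothetical names; nothing here reflects the CLUT formats the packages actually use.

```python
def false_color_clut():
    """Build a 256-entry table mapping a gray level to an (r, g, b) triple.

    The ramp runs blue -> green -> red. This palette is an illustrative
    assumption, not one shipped with any of the reviewed packages.
    """
    table = []
    for level in range(256):
        if level < 128:
            # Lower half: fade from blue to green.
            r, g, b = 0, 2 * level, 255 - 2 * level
        else:
            # Upper half: fade from green to red.
            r, g, b = 2 * (level - 128), 255 - 2 * (level - 128), 0
        table.append((r, g, b))
    return table


def apply_clut(image, table):
    """Replace each gray level in `image` (rows of ints 0..255) by its color."""
    return [[table[pixel] for pixel in row] for row in image]
```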

I will describe two tasks I used to test 
the software a little beyond its relatively 
easy-to-use Mac-faithful point-and-click 
user interface. In fact, most serious use 
of IPLab involves more than a few steps, 
and the scripting language provided will 
prove very useful. Indeed, the vendor 
provides some standard processing tools 
(such as edge detectors) implemented 
with these scripts. Once a script is writ¬ 
ten, you will barely notice that it’s not a 
single IPLab command. 


The first task. For the first problem, I 
wanted to apply a filter whose support 
exceeded 5x5 pixels to an image in PICT 
format captured by other software from a 
frame grabber. The capture itself took 
more than one step, because the software 
does not directly support the frame grab¬ 
ber hardware I used. (I advise readers to 
contact the vendor about optional video 
capture support, because of the growing 
support for a wide range of hardware.) 

The product does support most of the 
standard image interchange formats. So,
I used another piece of software to do
the capture, wrote out the image, then 
read it with IPLab. I intended to process 
a natural image with a few circularly 
symmetric difference-of-gaussian filters, 
which emulate certain of the early stages 
of the human visual system. These filters 
were generated by software written in 
Think Pascal by Hugh Wilson at the Eye 
Research Laboratories of the University 
of Chicago. That software produces the 
filters in PICT format, which IPLab im¬ 
ports directly. Once IPLab read the fil¬ 
ters and image, I took the Fourier trans¬ 
form of each and applied the complex 
multiplication operation between the re¬ 
sults, then applied the inverse transform. 
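
The transform, multiply, inverse-transform sequence rests on the convolution theorem. A minimal Python sketch follows, using a naive discrete Fourier transform so that it needs only the standard library. It assumes the filter has already been zero-padded to the image size, and the result is a circular (wrap-around) convolution. The function names are hypothetical, and a real package would use a fast transform instead.

```python
import cmath


def dft2(data, inverse=False):
    """Naive 2D discrete Fourier transform of a square list-of-rows array."""
    n = len(data)
    sign = 1 if inverse else -1
    scale = 1.0 / (n * n) if inverse else 1.0
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for y in range(n):
                for x in range(n):
                    angle = sign * 2 * cmath.pi * (u * y + v * x) / n
                    total += data[y][x] * cmath.exp(1j * angle)
            out[u][v] = total * scale
    return out


def filter_in_frequency_domain(image, kernel):
    """Circularly convolve `image` with `kernel` via pointwise complex multiply."""
    n = len(image)
    image_f = dft2(image)
    kernel_f = dft2(kernel)
    product = [[image_f[u][v] * kernel_f[u][v] for v in range(n)] for u in range(n)]
    # For real inputs the imaginary part of the result is only rounding noise.
    return [[value.real for value in row] for row in dft2(product, inverse=True)]
```

Because the multiplication is pointwise, the cost does not grow with the filter's support, which is why this route sidesteps a 5x5 mask limit.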

The above description actually omits 
an important step required due to the 
Macintosh Quickdraw software’s some¬ 
what unpleasant representation of im¬ 
ages. In essentially all applications 
where display is necessary, Quickdraw 
represents black as 255 and white as 0, 
but all intermediate 8-bit intensities as 
increasing from 1 to 254. The resulting 
complications and how to deal with them 
are well explained in the IPLab manual. 
Roughly, they entail either sacrificing 
some intensity range or tolerating dis¬ 
play oddities in purely black or purely 
white regions during intermediate com¬ 
putations. It is fairly easy to forget to 
take the recommended steps before 
doing operations that convert 8-bit pixel 
values to real numbers. The learning 
process entails tolerating the oddities at 
the end, too. 
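
One reading of the "sacrificing some intensity range" option is to clip pixel values so that the two specially treated end codes never occur before converting to real numbers. The sketch below shows only that general idea; the manual's recommended steps are not reproduced in this review, so the function and its default limits are assumptions.

```python
def clamp_to_safe_range(image, low=1, high=254):
    """Clip 8-bit pixel values into [low, high].

    After clipping, the end codes 0 and 255, which the display treats
    specially, no longer appear in the data. The price is two lost levels.
    """
    return [[min(max(pixel, low), high) for pixel in row] for row in image]
```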

The second task. In a second test, I 
wanted to use the software to interac¬ 
tively remove sinusoidal noise from the 








image. I planned to use the standard 
technique of isolating the resulting im¬ 
pulse in the Fourier magnitude spectrum 
and reducing its amplitude to approximate
that of its neighbors. Actually,
I had no noisy image at hand, so I did
the reverse and introduced the noise as 
follows: 

First, I loaded the previously saved 
transform. I produced and saved it as two 
images: magnitude and phase. You can 
also produce real and imaginary parts, 
which you might find more useful for 




applying filters that are not zero phase, 
since you can do the filtering with the 
complex multiply operation in one step 
in the frequency domain. Otherwise, you 
must multiply the magnitudes and add 
the phases of filter and image before 
back-transforming. 
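
The equivalence between the two routes is ordinary complex arithmetic: multiplying two complex numbers multiplies their magnitudes and adds their phases. A small Python check, using made-up sample values for a single spectrum coefficient:

```python
import cmath


def multiply_polar(mag_a, phase_a, mag_b, phase_b):
    """Multiply two complex numbers given as (magnitude, phase) pairs."""
    return mag_a * mag_b, phase_a + phase_b


image_coeff = complex(3.0, 4.0)    # hypothetical image spectrum sample
filter_coeff = complex(0.0, 2.0)   # hypothetical filter sample (not zero phase)

# Route 1: real/imaginary representation, one complex multiply.
direct = image_coeff * filter_coeff

# Route 2: magnitude/phase representation, multiply and add separately.
mag, phase = multiply_polar(*cmath.polar(image_coeff), *cmath.polar(filter_coeff))
via_polar = cmath.rect(mag, phase)
```

Both routes give the same coefficient up to rounding. For a zero-phase filter the phase addition is a no-op, which is why the magnitude and phase form is adequate in that case.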

Second, I moved the cursor to the 
point (64, 64) in the 256x256 magnitude
image. Since the default FFT behavior 
puts the origin at the center, this corre¬ 
sponds to the point (-64, -64) in the fre¬ 
quency domain. Typical in such soft¬ 
ware, a separate status window can be 
raised to show the pixel position of the 
cursor. At that point I defined a 1-pixel 
wide region of interest (ROI), a rectan¬ 
gular subset of the image to which you 
apply operations. (IPLab also supports 
defining polygonal subsets, but much 
processing is essentially done on their 
rectangular bounding boxes.) 
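
The coordinate pairs quoted in this walkthrough are all consistent with a frequency origin at pixel (128, 128). Under that assumption, the conversion between window pixels and centered frequency coordinates is a simple offset. The helper names below are hypothetical.

```python
def pixel_to_frequency(x, y, center=128):
    """Convert window pixel coordinates to centered frequency coordinates."""
    return x - center, y - center


def frequency_to_pixel(u, v, center=128):
    """Convert centered frequency coordinates back to window pixel coordinates."""
    return u + center, v + center
```

For example, `pixel_to_frequency(64, 64)` gives `(-64, -64)`, matching the text.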

To this ROI I applied the linear point 
function that maps x to (ax+b)/c, specifying
a=0, c=1, and b=1,000,000. I chose
this value by putting the cursor at (128, 
128) (that is, (0, 0) in the frequency 
plane). The status window told me that 
the value there, which is the DC compo¬ 
nent and so has maximum magnitude in 
the spectrum, was 6e+7, so my choice 
was rather large, as desired. At the time,
I found it a minor inconvenience that the
linear point parameters must be entered 
as integers, especially since the entry 
window is too small to make seven digits 
visible. (However, text entry windows
scroll on the Mac. See my complaint below
about long file and window names.)
I have doubts that high accuracy ma¬ 
nipulation is actually required in this 
circumstance, and the renormalization 
commands probably can make things 
fall in a suitable range to abide by this 
restriction. It does seem artificial to me, 
though. 
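
The linear point function applied to a region of interest can be sketched as follows. The `(left, top, right, bottom)` layout of the ROI tuple, with exclusive right and bottom edges, is an assumption of this sketch and not IPLab's representation.

```python
def linear_point_function(image, roi, a, b, c):
    """Replace each pixel x inside `roi` with (a*x + b) / c.

    `roi` is (left, top, right, bottom), right and bottom exclusive.
    Returns a new image and leaves the input untouched.
    """
    left, top, right, bottom = roi
    result = [list(row) for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            result[y][x] = (a * result[y][x] + b) / c
    return result
```

With `a=0`, `b=1000000`, and `c=1`, every pixel in the ROI is overwritten with the constant 1,000,000, which is how the impulse was planted in the magnitude image.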

I did a similar manipulation at (192, 
192), corresponding to (64, 64) in the 
center origin coordinate system. I could 
have simplified the repeated part of the 
operation somewhat with the IPLab 
scripting language, but it appears to me 
that that language has only limited abil¬ 
ity to manipulate variables, and I might 
well have been left with some interac¬ 
tive work to do. 

With the impulse thus added to the 
spectrum — corresponding to a diago¬ 
nally oriented sinusoidal pattern in im¬ 
age space — I applied the inverse 
Fourier transform to obtain an image 
which might well have been generated 
by an imaging system suffering from in¬ 
terference at a fixed spatial frequency. 
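
The link between a pair of mirrored spectral impulses and a diagonal sinusoid can be checked on a small grid. This Python sketch uses a naive inverse transform on an 8x8 spectrum whose origin is at index (0, 0) rather than at the center, and the impulse amplitude is chosen so that the resulting cosine has unit amplitude. All sizes and values are illustrative.

```python
import cmath
import math


def inverse_dft2(spectrum):
    """Naive inverse 2D DFT; returns the real part as a list of rows."""
    n = len(spectrum)
    image = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            total = 0j
            for u in range(n):
                for v in range(n):
                    angle = 2 * math.pi * (u * y + v * x) / n
                    total += spectrum[u][v] * cmath.exp(1j * angle)
            image[y][x] = (total / (n * n)).real
    return image


n = 8
spectrum = [[0j] * n for _ in range(n)]
spectrum[1][1] = n * n / 2      # impulse at frequency (1, 1)
spectrum[-1][-1] = n * n / 2    # mirrored impulse at (-1, -1), stored at index n - 1
pattern = inverse_dft2(spectrum)
# pattern[y][x] equals cos(2*pi*(x + y)/n): stripes running along the diagonal.
```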

The reverse process, removing the 
noise, would be substantially similar, 
although you might need to renormalize 
the magnitude if the noise were of suffi¬ 
ciently low power to appear black in the 
magnitude image window (unless you 
already knew its frequency and could 
isolate it by simply moving the cursor 
to the appropriate point in the magni¬ 
tude window and acting there on faith). 
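
The removal step, reducing an isolated spike to roughly the level of its neighbors, might look like the sketch below. Averaging the eight surrounding values is one plausible choice made here for illustration; the review does not say how the replacement value would be picked.

```python
def suppress_spike(magnitude, x, y):
    """Replace magnitude[y][x] with the mean of its eight neighbors.

    Returns a new array. Assumes (x, y) does not lie on the border.
    """
    neighbors = [
        magnitude[y + dy][x + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    ]
    result = [list(row) for row in magnitude]
    result[y][x] = sum(neighbors) / len(neighbors)
    return result
```

For a real-valued image the spike appears twice, at a frequency and at its mirror image through the origin, so the function would be applied at both points.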

Comments. Both these exercises left 
me pleasantly confident that IPLab eas¬ 
ily handles manipulations beyond those 
explicitly envisioned by the scripts sup¬ 
plied with the software. 

My criticisms of IPLab are almost 
quibbles, or perhaps suggestions for fu¬ 
ture enhancements. I think the scripting 
language could provide greater utility if 
it had more direct subroutine mecha¬ 
nisms and generalized the treatment of 
variables (they must be integers, but 
many operations are usefully applied to 
real and complex valued objects). In¬ 
deed, a number of the facilities seem 
unnecessarily restricted to integers. 

On the interactive side, I’d like to see
some simple arithmetic permitted di¬ 
rectly in the numerical input windows 
so that, for example, you need not think 
or resort to a desk accessory for such 
things as finding the point 3/4 of the 
way down the image. For processing in 
the spectral domain, it would help to 
permit disconnected ROI specifications. 
Indeed, such regions might be useful in 
the image domain in applications — 
such as brain scans — having a high de¬ 
gree of symmetry, where the user might 
wish to process two symmetric parts in
tandem. Also, for processing in the frequency
domain, circular and elliptical
ROIs might help, although the resulting 
complications in computation could 
conceivably militate against this.

I would like to see the ability to spec¬ 
ify the origin in the status window 
(where the current cursor position is dis¬ 
played). As indicated above, this could 
simplify interactive processing in the 
frequency domain. 

Another useful feature would permit 
write-locking of files from inside IPLab. 
Users not running under Multifinder 
cannot easily protect hours of work 
against their own mistakes without exit¬ 
ing IPLab to do so. 

The internal file format for images 
should be provided, so that users pro¬ 
ducing images from their own programs 
can generate them directly without 
going through one of the standard inter¬ 
change formats. Signal Analytics has 
indicated that they will give this on re¬ 
quest, but I feel it should be included in 
the distributed documentation. 

Finally, I would like to see interfaces 
to Think Pascal and C supported, as the 
scientific audience of this product is 
more likely to use them than Apple’s 
MPW compilers. 

Interface. The user interface is quite 
faithful to the Macintosh conventions. 
Image processing shares so little with 
most other applications that the one ex¬ 
ception I found to “the Macintosh Way” 
was only a passing annoyance. No key¬ 
board shortcut exists for printing, and 
the traditional one, Command-P, does 
something else. This got me in a little 
trouble when I started with the standard 
Mac user’s fantasy that the Mac user in¬ 
terface is so consistent, you never need 




to read the application manual to do 
something useful. 

Information and documentation. In
addition to the strong capabilities and
competent user interface of IPLab, Sig¬ 
nal Analytics provides registered users 













with a quarterly newsletter devoted to 
processing techniques and tutorials (as 
well as product announcements) and in¬ 
vites users to submit articles. The second 




issue, which came with the software, had 
a brief, well-written article on edge de¬ 
tection and an implementation of it in the 
scripting language. 

IPLab’s documentation is clear, com¬ 
plete, and concise. I strongly advise new 
users to work through the instructive in¬ 
troductory tutorial chapters. While I con¬ 
cede the introduction’s point that “it is 
not necessary [to have an understanding 
of image processing methods and termi¬ 
nology] for an understanding of the man¬ 
ual,” it would be somewhat pointless to 
use either the documentation or the soft¬ 
ware without at least some such under¬ 
standing. Neither attempts to teach image 
processing, but they would make a fine 
addition to many of the standard texts on 
the subject. 

After I had written this review, I re¬ 
ceived IPLab 1.1 but did not test it. Ac¬ 
cording to the documentation, the vendor 
has enhanced IPLab’s measurement ca¬ 
pabilities, supplied more morphological 
operation scripts, and released or an¬ 
nounced several additional optional 
frame-grabber support packages. Most 
welcome is a complete description of the 
IPLab image file format included in the 
latest issue of the IPLab newsletter, 
which also contains an example of the 
kind of frequency domain filtering de¬ 
scribed above. 

Recommendations. In summary, I can 
recommend this product highly to any¬ 
one needing to do sophisticated image 
processing. It would also make a first- 
rate foundation for an image processing 
course laboratory. 

IPLab is available from Signal Analyt¬ 
ics, 374 Maple Ave. E., Suite 200, Vi¬ 
enna, VA 22180, phone (703) 281-3277 
or (703) 281-2509. It has a list price of 
$499. The user customization option lists 
for an additional $150. 

Reader Service 21 


Ultimage 

Ultimage from GTFS is a high-end 
image processing product that lists at a 
price four to six times that of IPLab, and 
versions of it have been on the market 
longer. The price brings with it a number 
of features not available in IPLab, and 
some production users might find these 
necessary and thus worth the higher 
price. The main such features are the 
support for image processing add-on 
hardware and more sophisticated mor¬ 
phological and quantitative image analy¬ 
sis. On the other hand, the extensibility 
of IPLab makes many of the software 
features easy to implement, and time will 
likely narrow the gap between these two 
products. For the most part, Ultimage 
will be preferable principally to those 
people, such as medical researchers, who 
have neither the time nor expertise to im¬ 
plement analysis programs in an exten¬ 
sion language, as required by the less ex¬ 
pensive product. 

The standard features of Ultimage 
closely resemble those of IPLab and Im¬ 
age 1.27 (described below). Ultimage has 
more flexible region selection, including 
regions with disconnected components. It 
permits measurements to be scaled in 
standard units (as does Image 1.27. IP¬ 
Lab, however, permits you to specify 
unit names when using the measurement 
tools. Such an enhancement seems too 
simple to be omitted from the other pack¬ 
ages). In turn, this permits feature analy¬ 
sis directly in the software without cum¬ 
bersome computations that relate the 
sampling rate to the original source of 
the data. 

For example, using the particle detec¬ 
tion capabilities of Ultimage, a medical 
researcher can easily ask for statistics 
and display of all objects in the image 
that have intensity in a given interval and 
whose area falls within a given interval 
specified in mm². Such techniques could
isolate objects of special interest from 
bioscience, satellite, or electron micro¬ 
scope images. You need not know how 
many pixels comprise the feature you 
seek. You can plot statistical distribu¬ 
tions for the objects so defined, with re¬ 
spect to one of the defining parameters, 
and make scatter plots with respect to 
pairs of parameters. 
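
The kind of query described here, objects within an intensity interval whose area falls within a range given in mm², can be sketched with a flood fill. This is not Ultimage's algorithm. The use of 4-connectivity, the `mm_per_pixel` calibration parameter, and the function name are all assumptions for illustration.

```python
from collections import deque


def find_particles(image, low, high, min_area_mm2, max_area_mm2, mm_per_pixel):
    """Return the areas (in mm^2) of 4-connected regions whose pixels lie in
    [low, high] and whose area falls within [min_area_mm2, max_area_mm2]."""
    height, width = len(image), len(image[0])
    pixel_area = mm_per_pixel ** 2
    seen = [[False] * width for _ in range(height)]
    areas = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or not (low <= image[y0][x0] <= high):
                continue
            # Breadth-first flood fill to count the pixels of this region.
            count = 0
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            while queue:
                y, x = queue.popleft()
                count += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and low <= image[ny][nx] <= high):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area = count * pixel_area
            if min_area_mm2 <= area <= max_area_mm2:
                areas.append(area)
    return areas
```

The calibration factor is what lets the user state the area limits in physical units without counting pixels.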

The most serious drawback I found to 
Ultimage was its special treatment of fre¬ 
quency plane windows resulting from 
Fourier transforms. Only one such plane 
can be kept at any one time. Worse, al¬ 
though you can apply processing to these 
windows, such processing only affects 
the display and cannot affect the trans¬ 
form itself. Thus, you cannot easily ma¬ 
nipulate the Fourier domain from within 


the program. This would make difficult 
either of the techniques mentioned above 
(removal of sinusoidal noise and applica¬ 
tion of filters with kernels larger than the 
maximum supported by the software — 
7x7 for Ultimage). The documentation 
for the programming interface suggests 
that you can overcome this lack by writ¬ 
ing stand-alone code, but such a solution 
seems excessively complex to me. How¬ 
ever, Ultimage does provide linear at¬ 
tenuation and ideal filtering in the fre¬ 
quency domain. 

All Ultimage windows can be printed, 
which is an important advantage in scien¬ 
tific applications where publishing the 
analysis is at least as important as pub¬ 
lishing the raw data, even when the data 
are pictures. 

Ultimage supports 3D representations 
of gray-level data. Fourier magnitudes 
are substantially easier to interpret when 
viewed this way. In addition, Ultimage’s 
default normalization for Fourier magni¬ 
tudes provides a more useful view than 
IPLab’s, which requires an additional 
renormalization step to get an under¬ 
standable picture of the magnitude. 

When I attempted to read a 512x512 
PICT format image created with IPLab, 
Ultimage insisted on truncating it to 




512x480, the latter being the screen 
height. I don’t know whether this repre¬ 
sents IPLab violating PICT specifica¬ 
tions on writing or Ultimage making 
unwarranted assumptions about some 
optional parameters, but Image 1.27 was
able to read this file, and neither Image 
1.27 nor IPLab had any difficulty read¬ 
ing files written by Ultimage. When I 
saved the image with IPLab in TIFF for¬ 
mat, Ultimage had no difficulty reading 
it. Customer support at GTFS said that 
the truncation is not the expected behav¬ 
ior of the Ultimage software and offered 
to examine the file. 

Ultimage is available from GTFS,
2455 Bennet Valley Rd., 100C, Santa
Rosa, CA 95404, phone (707) 579-1733,
fax (707) 578-3195. I tested version
1.3.4, which has a list price of $2,990. 

Reader Service 22 











Image 

I will report only briefly on Image, in 
part because you can (and should) get it 
for free and try it for yourself. I got my 
copy from the Software Exchange Li¬ 
brary of the Boston Computer Society, 
but many bulletin boards and some other 
sources carry it. 

Again, this product provides the stan¬ 
dard image manipulation features in an 
easy-to-use form. Image 1.27 does not do 
fast Fourier transforms — a major ob¬ 
struction to its use as a teaching tool. 
However, according to the documenta¬ 
tion, an experimental version available 
from Internet sites does. Although fre¬ 
quency domain manipulation is thus not 
possible in Image 1.27, it does accept 
convolution kernels up to 63x63 pixels 
in size, somewhat reducing the need to 
pass to the frequency domain to imple¬ 
ment large filters. Note, however, that 
this still requires you to design the filter 
as a convolution, which might take more 
expertise and software than many users 
have. 
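
To give a sense of what designing a filter as a convolution involves, here is a minimal Python sketch that builds a circularly symmetric difference-of-gaussians kernel of the kind mentioned earlier in this review. The kernel size and the two sigma values are left as free parameters. Nothing here reproduces the actual filters used in the test, and the function names are hypothetical.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size gaussian kernel normalized to unit sum (size odd)."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def difference_of_gaussians(size, sigma_center, sigma_surround):
    """Narrow gaussian minus wide gaussian, each normalized to unit sum."""
    center = gaussian_kernel(size, sigma_center)
    surround = gaussian_kernel(size, sigma_surround)
    return [
        [c - s for c, s in zip(center_row, surround_row)]
        for center_row, surround_row in zip(center, surround)
    ]
```

Because both gaussians sum to one, the resulting kernel sums to zero and so gives no response on uniform regions. A call such as `difference_of_gaussians(63, 4.0, 8.0)` would fill the largest kernel Image 1.27 accepts.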

Like Ultimage, Image 1.27 has 
strengths in feature identification. It too 
provides tools for measuring in user- 
specified units and has a wider variety of 
quantitative analysis capability than does 
IPLab, in some cases approaching the 
variety of Ultimage. 

Image has some pixel manipulation 
capabilities not found in the other two 
packages. Separate paint programs might 
well be satisfactory for occasional image 
touch up, but users with many images 
needing retouching before processing 
could save time with this capability, es¬ 
pecially if, as I do, they prefer to avoid 
the Multifinder. For example, one image 
I worked on, a video capture, had ex¬ 
treme reflection from the subject’s eye 
glasses in the upper corner, making it
difficult to isolate one of the subject’s 
eyes. Minor airbrushing before process¬ 
ing alleviated this problem without the 
need to resort to a separate package. 

The main difficulty I see with Image 
1.27 as a production tool is the level of 
sophistication required to customize it 
and simplify complex repetitive steps. 

This objection also holds (although 
somewhat less) for Ultimage. In both 
cases, people working with a lot of im¬ 
ages and using long sequences of pro¬ 
cessing steps will have to invest more 
time in developing shortcut tools than 
with IPLab. 

Version 1.29 of Image was released on 
May 31, 1990. It reportedly has an em¬ 
bedded Pascal-like macro language. Im¬ 
age 1.27 is available from the National 
Technical Information Service (NTIS) 
and many Macintosh bulletin boards and 
user groups. It may be freely copied, distributed,
and modified.

You can obtain Image from (a) NTIS, 
phone (703) 487-4650, order number 
PB90-500687 ($100 check, Visa, or 
Mastercard); (b) via anonymous FTP 
from sumex-aim.stanford.edu [36.44.0.6] 
(the application is in /info-mac/app, and 
the source is in /info-mac/source); or (c) 
via anonymous FTP from alw.nih.gov 
[128.231.128.251] in the directory 
/pub/image. 


Reader Service 23 


Extensibility and scripting 

I didn’t examine the extension and 
scripting tools of these packages. I based 
my comments below on brief examina¬ 
tion of the documentation. 

Of the three packages, IPLab has the 
only scripting language. For the others, 
you have to shorten repetitive tasks using 
one or another of the macro recorders available
for the Mac. (Ultimage provides Automac,
a fairly sophisticated third-party
macro recording system, which I did 
not test.) It might be a matter of taste 
whether the embedded language or exter¬ 
nal macro approach works better, but the 
IPLab language permits raising dialogue 
boxes that can alter the parameters of the 
scripts. It also permits limited program¬ 
mability, including conditional control 
based on variables from the processing 
commands. This is unlikely to be so 
simple with separate macro packages. 

Some tasks simply cannot be repre¬ 
sented conveniently with scripts. All 
three packages provide interfaces to 
stand-alone user-written programs. 
Ultimage’s programming interface 
comes with source code. This should 
make it somewhat more general than that 
of IPLab, which consists of object librar¬ 
ies for Apple’s MPW C and MPW Pascal 
and Language Systems’ Fortran. Both 
Ultimage and IPLab give you access to 
internal data structures of the image 
processing facilities. 

Ultimage has more extensive docu¬ 
mentation and includes the programming 
tools without charge. The IPLab toolkit 
is an extra-cost ($150) option. The popu¬ 
larity of Symantec’s Think Pascal and 
Think C ought to induce the vendor to 
support them for IPLab. 

In the case of Image 1.27, you can get 
the full source code for this public do¬ 
main program from the authors (it was 
not part of the distribution I received 
from the Boston Computer Society). This 
means that sufficiently skilled program¬ 
mers can remedy any and all shortcom¬ 
ings by adding their own modifications 
to the sources. Although the current distribution
is written in Think Pascal (an
admirable choice, in my opinion, since it 
is widely available, inexpensive, and 
very good), the authors’ documentation 
claims that it should be portable to other 
Pascal systems. 


Frame grabber support 

Two of these packages directly sup¬ 
port various popular frame grabber hard¬ 
ware. I didn’t test any of these. Since all 
can import image files in several stan¬ 
dard formats, occasional users of frame 
grabbers can probably make do with al¬ 
most any software to produce the image 
and then import it. Users should check 
with the vendors for the current status of 
support of their favorite frame grabber. 

According to the relevant documenta¬ 
tion, Image supports the Data Translation 
Quick Capture and Image Systems Tech¬ 
nology (formerly Scion) Video Image 
1000. IPLab supports the Quick Capture
card and has Scion support “in the 
works.” The vendors of Ultimage sell a 
proprietary card. In all likelihood, the 
public domain nature of Image will result 
in much contributed support, and users 
of the commercial products might find 
that they use Image for data acquisition 
for later processing by the other package. 

Color 

I did not make any significant tests of 
the color capabilities of any of this soft¬ 
ware. Pseudo color can be especially im¬ 
portant for some kinds of interactive 
processing, such as feature identification. 
Because the Macintosh world is in the 
process of converting to the new 24-bit 
Quickdraw color standards, I feel users 
to whom color is important might want 
to see if and how this kind of software 
supports the new environment in the near 
future. 


Generic complaints 

Each package refuses to start on a sys¬ 
tem without an 8-bit color card. Cer¬ 
tainly, no useful appearances can be gen¬ 
erated with less than 8 bits for gray-scale 
images. However, all these packages do 
a number of useful things for which the 
output does not require gray scale. These 
include histograms and other graphs and
perhaps even some processed images,
such as the result of edge detection. I would
prefer to see these packages simply warn 
of the consequences and let people do 
whatever work they consider useful on a 
Mac without 8-bit color. In principle, the 








much vaunted Macintosh Quickdraw 
generality should make this reasonably 
feasible. However, it might be compli¬ 
cated by the aforementioned Mac con¬ 
vention regarding the representation of 
black and white among gray levels. Pre¬ 
sumably, licensing issues also affect the 
commercial products, which are typically 
licensed for one machine per copy. 

Finally, the famed Macintosh uniform 
user interface is (as with most Mac soft¬ 
ware) sometimes as obtrusive as useful. 
The most tiresome example — certainly 
not the fault of the authors of these 
packages — is that some dialogue 
boxes that display text do not provide 
any mechanism for scrolling text larger 
than what fits in the box. This makes it 
difficult to select from among files and 
windows whose names have a common 
first part. This happens more than a little, 
because the names of processing output 
windows are usually derived from those 
of input windows by simply adding a few 
characters. 


Documentation 

The documentation for each package 
presumes familiarity with the Macintosh 
user interface. Subject to this, all three 
have clear and complete documentation. 
The commercial packages have extensive 
tutorials, whereas the public domain one 
has a one-page example on analyzing 
electrophoretic gels and a few pages of 
Q&A about common problems. While 
you can hardly complain at the price of 
Image, you should bear in mind that 
most use of these tools involves more 
than one processing step, and the learn¬ 
ing curve might well be steeper without 
tutorials to illuminate the processing phi¬ 
losophies lurking in the software. 


Review notes 

PC-Kwik Power Disk. I can list at 
least six reasons to recommend PC-Kwik 
Power Disk, a program to provide disk 
file defragmenting, testing, file locating, 
and bad sector remapping. First, it is 
very flexible. A number of options allow 
you to tailor this program to fit your 
needs exactly, and it can easily be set up 
to run in batch mode. Second, it is very 
versatile. You can use Power Disk on 
both fixed and removable disks, whether 
they are MFM, RLL, ESDI, or SCSI hard 
drives. Additionally, you can divide the 
disks into multiple partitions larger than 
32 Mbytes in size. Third, the software is 


Printing 

Insufficient attention is paid to gener¬ 
ating pure Postscript files instead of 
printing to Postscript printers — impor¬ 
tant for typesetting publication-quality il¬ 
lustrations. Various standard tricks for 
causing the print dialogue to generate a 
Postscript file seemed to have no effect 
on any of these packages. This situation 
is only likely to get worse, since Apple 
appears to be attempting to pry itself 
loose from the de facto Postscript stan¬ 
dard and impose its own proprietary im¬ 
aging models. Software developers 
should not play into this and should pro¬ 
vide ways to generate Postscript files 
that do not assume the software is run¬ 
ning on hardware directly connected to 
Apple printers. 

All windows should be printable. This 
holds largely true in Image 1.27, rela¬ 
tively so in Ultimage, and unfortunately 
not in IPLab. I found no way to print his¬ 
tograms in IPLab, although you can dis¬ 
play them. 


Performance and 
requirements 

All packages require a Mac II with 8- 
bit color cards and at least 2 Mbytes of 
memory. Image 1.27 and IPLab docu¬ 
mentation recommend at least 4 Mbytes, 
and Ultimage documentation recom¬ 
mends 5 Mbytes. 

Both Ultimage and IPLab took about 
65-70 seconds to do a 256x256 Fourier 
transform on a standard Mac II (with a 
16-MHz 68020 CPU) and about five 
times that for 512x512 images (approxi¬ 
mately the factor predicted by algorithms 
executing in N log N time in each direc¬


set up to reorganize your disk safely, 
even during reboots or power loss. 
Fourth, it does defragmenting faster than 
any other utility I have used, on the order
of two minutes or less for a 40-Mbyte
partition. Fifth, it is very fast. And sixth, 
it is very, very fast. So, while speed 
might not be everything, it surely is criti¬ 
cal if you choose to keep your disks de¬ 
fragmented on a regular basis. 

System installation is done for you 
through an install program. You can eas¬ 
ily drop the distribution floppy into your 
machine, run the install program, and 
start using the software without ever 


tion, as the FFTs would). The sales lit¬ 
erature for Ultimage claims a 25-30 fold 
speedup for the associated accelerator 
hardware. A colleague reports that IPLab 
took 26 seconds for a 256x256 transform 
with his third-party 40-MHz upgrade to 
a Mac II and, again, about five times 
longer for a 512x512 image. 
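
As a rough check on the scaling quoted above, the following sketch assumes that an N x N two-dimensional FFT costs on the order of N row transforms plus N column transforms, each taking N log2 N operations. The function name and the cost model are illustrative assumptions; neither package documents its actual algorithm here.

```python
import math

def fft2d_cost(n: int) -> float:
    """Relative operation count for an n-by-n 2D FFT.

    Assumed model: n row FFTs plus n column FFTs, each costing n*log2(n).
    """
    return 2 * n * (n * math.log2(n))

# Ratio of 512x512 cost to 256x256 cost under the assumed model.
ratio = fft2d_cost(512) / fft2d_cost(256)
print(f"predicted slowdown: {ratio:.2f}x")

# Scale the measured 65-70 second range for 256x256 up to 512x512.
low, high = 65 * ratio, 70 * ratio
print(f"predicted 512x512 time: {low:.0f}-{high:.0f} seconds")
```

Under this model the predicted factor is 4.5, which is consistent with the "about five times" observed on both machines.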

For each package, printing a 256x256 
image returned system control to me in 
about 40 seconds on a Mac connected to 
a LaserWriter Plus via Appletalk. Larger 
images will probably benefit from some 
kind of spooling. Postscript represents 
byte images in hex ASCII, doubling the 
storage requirement of the image. Thus, 
a 512x512 8-bit image will require about 
512 Kbytes to be sent to the printer. 
Communication time can be significant 
in this case, and a spooler should give 
you back control faster. I didn’t make 
any spooling tests, but a colleague re¬ 
ports that IPLab works fine with the 
TOPS spooler. 
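
The doubling described above can be sketched in a few lines. This assumes two hex characters per image byte and ignores any PostScript header, line breaks, or protocol overhead; the helper name is hypothetical.

```python
import binascii

def hex_ascii_size(width: int, height: int, bits_per_pixel: int = 8) -> int:
    """Bytes needed to send an image as hex ASCII (two characters per byte)."""
    raw_bytes = width * height * bits_per_pixel // 8
    return 2 * raw_bytes

# A tiny 4-pixel "image" shows the doubling directly.
pixels = bytes([0, 127, 128, 255])
encoded = binascii.hexlify(pixels)
print(encoded, len(pixels), len(encoded))

# The 512x512 8-bit case from the text.
size = hex_ascii_size(512, 512)
print(size, "bytes =", size // 1024, "Kbytes")
```

The 512x512 case works out to 524,288 bytes, matching the 512 Kbytes quoted in the text.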


Summing up 

Start with Image and see if it meets 
your needs. For a general package, espe¬ 
cially for a course in image processing, 
IPLab proves the most suitable. Although
IPLab is an outstanding value for a single
copy, its quantity discounts (5 percent
for 3-5, 10 percent for 6-10, 20 percent
for 11 or more) are not sufficient to
encourage a large course lab. Ultimage
supports the most sophisticated quantita¬ 
tive analysis and hardware accelerators, 
and this might be worth the six-fold 
price multiple over IPLab. An untested 
version, Ultimage/S, has a list price of 
$2,000 and excludes most of the features 
that give Ultimage any advantage over 
IPLab. 


reading the excellent and thorough man¬ 
ual. The defaults set up for you will gen¬ 
erally be the ones that you want and 
need, but you can tailor the system as de¬ 
scribed below. Starting up Power Disk 
puts you into interactive mode, where 
you first select the disk you want to work 
with. The characteristics for that drive 
are then presented in summary form 
along with the menu. Menu choices in¬ 
clude analyze only, disk explorer, reor¬ 
ganize the disk, select a new disk, mod¬ 
ify the program options, and quit. 

The analyze only option provides a 
complete summary of the number of files
on the disk, the total bytes used, how
many fragmented files, how much empty 
space, percentage of disk used, number 
of system, hidden, and read-only files
along with number of unmovable frag¬ 
ments in these files, and the number of 
bad block clusters. A key press provides 
a graphical disk map of this information, 
and one more key press offers you the 
ability to respond “yes” to testing the 
disk for unreliable clusters. A “no” takes 
you back to the main menu. 

The next menu selection provides for 
the disk defragmenting. While it is oc¬ 
curring, you see dynamically what is 
going on either by watching the R and W 
characters rapidly moving across the 
map, or by turning the map off and 
watching the names of the files being 
moved. Since the map is scaled to fit on¬ 
screen (each block is actually many clus¬ 
ters), a zoom-in feature allows you to re¬ 
duce the number of clusters per block for 
more detailed viewing (of course, you 
can also zoom out). 

The reporting options menu item con¬ 
tains a submenu that lets you specify the 
type of reports you want (from none to 
quite detailed), the file handling options, 
the protection level, and the reorganiza¬ 
tion strategy. File handling is unique in 
that it lets you delete empty files or di¬ 
rectories and move hidden and read-only 
files. I appreciate the last feature, since 
some of the software I use creates hid¬ 
den files that, although not unmovable, 
are not normally moved by the other de¬ 
fragmenters I have used. 

The three protection levels allow you 
(1) to defragment and verify that the data 
moved is the same while being protected 
against power failure, (2) to skip the 
verification but still protect, or (3) to 
skip both the verification and protection 
(the fastest and adequate if you have,
as I do, an uninterruptible power supply
to handle power failures).

Reorganization strategies include (a) 
speedy reorganization to simply close up 
empty spaces, (b) a more modest one to 
put fragmented files last, since in all 
likelihood they will continue to fragment 
and thus have less effect in the future on 
your disk’s organization, (c) a traditional 
one to put your files in directory order, 
or (d) a least-amount-of-work option to 
simply defragment as many files as pos¬ 
sible into the empty areas without clos¬ 
ing up the empty spaces. 

Since all of these options can be speci¬ 
fied by switches after the program name, 
you can run Power Disk from a batch 
file. And the program’s speed makes it 
reasonable to put a command into your 
autoexec.bat file and run the program 
every time you reboot. In fact, you can 
even specify under what conditions you 
want Power Disk to run (for example,
only if badly fragmented) and the days of
the week to run it. 

Another unusual feature of Power 
Disk, the disk explorer, lets you see the 
fragmented disk space and the location of 
directories and files (either with wild¬ 
cards to see all files matching a given de¬ 
scription or unique file names in a speci¬ 
fied directory). So, if you’ve ever won¬ 
dered, as I have, where a file resides on 
the disk, Power Disk will show it to you
graphically and by cluster numbers. This 
is particularly useful if you want to move 
so-called unmovable files but don’t know 
which ones they are. To do so, you sim¬ 
ply move the cursor in the disk map until 
it is on top of the unmovable file (indi¬ 
cated by a U), then press a function key 
to learn the file name and its extent. 

What does it take to use Power Disk? 
First of all, you will need some free disk 
space — a minimum of only one cluster. 
But, for maximum reorganization speed 
you’ll need 64 Kbytes. System require¬ 
ments are a PC compatible, DOS 2.0 or 
later, a graphics adapter, and 256 Kbytes 
for partitions less than 32 Mbytes or 384 
Kbytes for partition sizes larger than that. 

The suggested price is $79.95 with a 
30-day money-back guarantee. Now, you 
can surely get a similar capability with 
one of the popular disk utility packages 
(Norton or PC Tools) and not have to pay 
extra for what you get here. But, frankly, 
you get what you pay for in those other 
products — nowhere near the speed or 
the flexibility of Power Disk. For ex¬ 
ample, running Windows 3.0 with a tem¬ 
porary swap file, I find Power Disk quite 
important. Defragmenting the disk gives 
me a much larger swap file. But I am not 
willing to wait the 10 minutes or more 
that my old defragmenter required before
I could use my machine productively. 

Thus, the little extra this software costs is 
well worth the savings in my time. I 
heartily recommend this product. 

Contact Multisoft at 15100 SW Koll 
Parkway, Beaverton, OR 97006, phone 
(800) 274-KWIK or (503) 644-5644.

— R. Eckhouse 


Morefonts. This is the second time 
I’ve had a chance to review Morefonts 
from Micrologic Software. My first re¬ 
view (Computer, Vol. 22, No. 10, Oct.
1989, pp. 77-79) praised this outstanding 
product. After all, for $100 (now $150) 
you got enough soft fonts, plus a menu- 
driven control program for your HP Las¬ 
erJet, to last you several lifetimes. With 
version 1.11, the company added a sig¬ 
nificant number of new features. In¬ 
cluded are support for WordPerfect 5.1, 
First Publisher, and Multimate 4. Symbol 
sets have been changed so that you can 
edit them for complete control of the 


characters in any font. And you can add 
reverse backgrounds to the fonts, for a 
total of more than 50,000 possible com¬ 
binations of fill patterns and effects. 

But let’s look at what Morefonts is all 
about. First, it comes with 14 scalable 
typeface outlines: Geneva (a version of 
Helvetica) in normal, italic, bold, and 
bold italic; Tiempo (a version of Times 
Roman), also in four variations; Pag¬ 
eant; Opera; Poster; Showtime; Bur¬ 
lesque; and Financial. You get three 
more for registering the product. The 
font fill patterns include Foggy, Sunset, 
Fountain, Woodgrain, and shades of 
gray. Add to that the font outlines (thin, 
thick, calligraphy, and contour) and 
shadows (drop right, drop left, 3D, etc.) 
and mix in the backgrounds (white, 
black, none, rattan, starburst, to name a 
few), and the results are spectacular. 

Installation is really straightforward.
A program called “Hello” will automati¬ 
cally install Morefonts on your hard disk 
(not required but recommended if you 
want to generate and store lots of fonts). 
At this point you can also configure the 
program for your system. To make it all 
work, you will need 384 Kbytes of 
memory and DOS 2.0 or later, as well. 
And you don’t need to read the manual 
that comes with this package because 
the on-line help is generally sufficient. 

I tested Morefonts with Windows 3.0, 
Ami Professional, and a Kyocera F- 
1000A laser printer. I needed a larger 
type than the Times Roman 10-point 
that comes built in on this laser printer. 
To generate all the fonts I wanted, I 
specified their characteristics, stored 
them as a font set descriptor, and in¬ 
stalled them into Windows (the program 
automatically modifies win.ini to indi¬ 
cate the new fonts). Next, I installed the 
fonts within Windows using the control 
panel. Thus, within minutes I had 12- 
point Times Roman (actually Tiempo) 
output on my laser, a painless process. 

Morefonts supports many other appli¬ 
cations, but I did not test them this time. 

A new release of this product, version 
1.2, is now out, and it includes support 
for the scalable printer font feature of 
the HP LaserJet III. You can also purchase
additional fonts besides the standard
17. So whether your needs are
mundane — to supplement the limited
number of fonts that came with your
LaserJet — or colorful — to spice up a
poster or overhead slide — this product 
will satisfy them easily. I rate this a 
“must have” font utility that will easily 
pay for itself when used with just a few 
of your critical applications. 

Contact Micrologic Software, 6400 
Hollis St., Suite #9, Emeryville, CA 
94608, phone (415) 652-5464, fax (415) 
652-7079.

— R. Eckhouse












Treesaver from Discoversoft is billed 
as “photo reduction software” for your 
LaserJet Plus and Series II printers (and
compatibles). That means it can take nor¬ 
mal laser printer output and compress it 
so that two or four pages print on a 
single side of paper. In all respects, Tree- 
saver provides an easy means to reduce 
the amount of printer paper you use. 

From copying the files from the distribu¬ 
tion disk and configuring the software (if 
necessary) to producing useful results, 
this product works and works well. If 
you need help, the manual is very com¬ 
plete and well laid out and includes both 
a table of contents and an index. 

Configuring Treesaver is a matter of 
cycling through the setup panels. In one 
you specify the LaserJet model and paper
size, the number of copies, manual feed, 
and various font and symbol parameters. 
Fonts can be internal, soft, or external. 
Treesaver supports more than 20 symbol 
sets using a separate default font panel. 
Another panel lets you select the printer 
port, startup mode (compressed or not), 
TSR hot key, fields related to graphics 
and macro buffer size, and scale plus in¬ 
dent control for printing in scaled mode. 

Treesaver provides a great deal of 
flexibility in that double mode prints in 


the opposite orientation from your origi¬ 
nal document. Thus, two full-size pages 
in portrait mode come out as one double¬ 
mode page in landscape. Alternatively, 
two double-column, full-size pages in 
landscape mode come out as one double 
page (four columns) in double mode. 
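
To get a feel for how much reduction these modes imply, the sketch below computes the largest uniform scale at which a full page fits into its share of the sheet. It assumes US letter paper, no margins, and a 2 x 2 grid for quad mode; these are illustrative assumptions, not Treesaver's documented behavior.

```python
def fit_scale(page_w: float, page_h: float, cell_w: float, cell_h: float) -> float:
    """Largest uniform scale at which a page fits inside a cell."""
    return min(cell_w / page_w, cell_h / page_h)

# Assumed paper: US letter, 8.5 x 11 inches, margins ignored.
W, H = 8.5, 11.0

# Double mode: two portrait pages side by side on one landscape sheet.
double_scale = fit_scale(W, H, cell_w=H / 2, cell_h=W)

# Quad mode (assumed layout): four portrait pages in a 2 x 2 grid
# on one portrait sheet.
quad_scale = fit_scale(W, H, cell_w=W / 2, cell_h=H / 2)

print(f"double: {double_scale:.3f}, quad: {quad_scale:.3f}")
```

Under these assumptions double mode shrinks each page to roughly 65 percent of its linear size and quad mode to 50 percent, which suggests why the choice of font matters for legibility.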

Another unusual feature is that Tree¬ 
saver will accept graphic images and 
scale them as well. That’s why you need 
to specify the size of the graphics buffer 
during setup. Also, if you use the 
LaserJet’s macro capability, you need to 
allocate some memory within Treesaver 
for holding macros. 

Since Treesaver is a RAM-resident 
program, it is active and able to intercept 
output to your laser printer at any time. It 
also monitors your keystrokes so you can 
enter commands from the keyboard in 
the form of Alt-* where x is 1, 2, 3,4, P, 
or L for single, double, scaled, quad, por¬ 
trait, or landscape mode, respectively. 
You can also insert “wake-up” codes of 
the form Esc-* in your text. They will in 
turn enable the appropriate print mode. 
These were specifically selected to make 
sure they did not conflict with normal 
LaserJet escape codes.

What does all this cost you in terms of 
memory space? Basically, it takes 18 


Kbytes plus whatever you allocate to the 
graphics and macro buffers. If you have 
expanded memory on your machine, 
Treesaver loads itself there. I ran it under 
Windows 3.0 with a 24-Kbyte graphics 
buffer, and it loaded in LIM memory 
with just a nibble (less than 1 Kbyte) out 
of main memory. No special hardware 
requirements exist, except the LaserJet,
so any PC or compatible will do as long 
as you use DOS 2.1 or higher. 

Compressing page images so that more 
than one can fit on a page is not new. A 
couple of shareware programs I know of 
will do a similar job. I prefer one of the 
shareware programs because it offers 
more options that I desire (such as the 
ability to print both sides, page number 
and titling, borders, and selection of 
which pages to print). Also, the share¬ 
ware program is less than half the price 
of Treesaver, which sells for $89.95. So, 
my suggestion is to shop around after 
you determine your needs. Treesaver has 
much to recommend it, particularly those 
features that aren’t available in the other 
products. 

Contact Discoversoft at 1516 Oak St., 
Alameda, CA 94501, phone (415) 769- 
2902, fax (415) 769-0149. 

— R. Eckhouse 


1991 International Symposium on Multiple-Valued Logic 


PROGRAM CO-CHAIRS 

AMERICAS 
Dr. Wayne Current 
Elect. Eng. & C.Sc. Dept. 
University of California 
Davis, CA 95616 
(916) 752-1839 or 0583 

EUROPE / AFRICA 
Dr. Michel Israel 
IIE-CNAM 

18, Allee Jean Rostand 
BP 77, 91002 Evry Cedex 
France 

+33 (1) 60 77 97 40 

ASIA / PACIFIC 
Prof. Okihiko Ishizuka 
Dept. of Electronic Eng.
Miyazaki University 
Miyazaki-Shi 889-21 Japan 
(81)985-58-2811 


CALL FOR PAPERS 

The Multiple-Valued Logic Technical Committee of the IEEE Computer 
Society will hold its 21st annual Symposium on May 26-29, 1991 in Victoria,
Canada. The Symposium is sponsored by the IEEE Computer Society, The 
University of Victoria, the Natural Sciences and Engineering Research 
Council of Canada, and the Association for Symbolic Logic. You are invited 
to submit an original research, survey, or tutorial paper on any subject in the 
area of Multiple-Valued Logic. Authors are requested to submit five copies
(in English) of their double-spaced typed manuscript on 8.5 by 11 inch or A4 
paper by November 1, 1990. Each paper should include a 50 - 100 word 
abstract. Please submit full addresses, telephone and fax numbers, email ad¬ 
dresses, etc. for all authors. Papers should be sent to the closest Program 
Chair. Authors will be notified by February 1, 1991. Photo-ready copies of
accepted papers are due by March 1, 1991. 

For further information contact Dr. D. M. Miller, ISMVL-91 Chair, Depart¬ 
ment of Computer Science, University of Victoria, Canada V8W 2Y2 (604) 
721-7220, email: dmill@csr.uvic.ca, fax: (604) 721-7292. 













CAREER OPPORTUNITIES 


RATES: $12.00 per line, (ten lines mini¬ 
mum). Average five typeset words per 
line, eight lines per column inch. Add 
$10 for box number. Send copy at least 
one month prior to publication date to: 
Marian B. Tibayan, Classified Adver¬ 
tising, COMPUTER Magazine, 10662 
Los Vaqueros Circle, PO Box 3014, 
Los Alamitos, CA 90720-1264; (714) 
821-8380; fax (714) 821-4010. 


THE UNIVERSITY OF KENTUCKY 

The University of Kentucky is seeking ap¬ 
plications and/or nominations to fill an 
endowed chair in the area of Computer En¬ 
gineering. The Robinson Chair in Computer 
Engineering requires a nationally recognized 
scholar and researcher in computer engi¬ 
neering with a Ph.D. in electrical and/or 
computer engineering. The candidate must 
have a record of proven experience which 
will enable him/her to provide the leadership 
necessary to elevate the University of Ken¬ 
tucky to nationally recognized status in this 
area. It is expected that a Center of Excel¬ 
lence in Computer Engineering will develop 
out of these efforts. In order to achieve these 
goals, the appointee will establish and main¬ 
tain a strong program of research, including 
extramural funding, develop and teach com¬ 
puter engineering courses at the graduate 
and undergraduate levels, and provide lead¬ 
ership and mentorship to faculty members 
and graduate students in this area. The posi¬ 
tion carries a salary which is highly competi¬ 
tive with today’s market, six faculty positions 
specifically designated in Computer Engi¬ 
neering, excellent department and university 
computing facilities including a supercomputer
and a host of other computers and workstations
linked by a campus-wide data communications
network, necessary laboratory
space and equipment and above all, strong 
university administrative support for the 
above mentioned goals for Computer Engi¬ 
neering. The Center for Computational Sci¬ 
ence, University Computing Center and 
Computer Science Department are included 
in supporting units with key interest in these 
plans. Applications and/or nominations 
should be sent to Dr. S. A. Nasar, Chair¬ 
man, Department of Electrical Engineering, 
University of Kentucky, Lexington, KY 
40506-0046. Additional information may 
also be obtained by writing to the above ad¬ 
dress or calling at (606) 257-8042. The Uni¬ 
versity of Kentucky is an equal opportunity/ 
affirmative action employer. 


ROBOTICS 

The McGill Research Centre for Intelligent 
Machines has been constituted as an inter¬ 
disciplinary grouping of researchers in Com¬ 
puter Science, Electrical Engineering, 
Mechanical Engineering and Biomedical 
Engineering. It consists of three groups: the 
Computer Vision and Robotics Laboratory, 
the Robotic Mechanical Systems Labora¬ 
tory, and the Systems and Control Group. 
There are 17 professors and about 100 grad¬ 
uate students in the Centre. The Centre 
possesses outstanding facilities for research 
and provides an excellent academic milieu. 


The Centre is seeking to fill two tenure 
track positions at the Assistant Professor level 
specializing in Sensor-Based Robotics.
One of these positions is in the Department 
of Electrical Engineering and the other in the 
Department of Mechanical Engineering. It is 
expected that the candidate will be inte¬ 
grated into one of the existing sub-groups in 
the Centre. The candidate should be a re¬ 
cent doctoral graduate with a strong interest 
in research and teaching. We are looking for 
outstanding candidates who see themselves 
as potential leaders in the field. Excellent 
communication and teaching skills are a 
must. An interest in industrial collaborative 
projects is required. 

In accordance with Canadian Immigration 
requirements, this position is offered in the 
first instance to Canadian citizens and per¬ 
manent residents. 

Interested people should send their Cur¬ 
riculum Vitae and the names of three 
referees to: Professor Martin D. Levine, 
Director, McGill Research Centre for In¬ 
telligent Machines, McConnell Engineering 
Building, 3480 University Street, Montreal, 
Quebec, H3A 2A7. 


VAX SYSTEMS ANALYST 

Install and support our on-line computer 
system; design DECNET Communication 
networking; general system management; 
direct interaction with corporate managers 
and employees in the analysis, design and 
implementation of the VAX system. 

Master of Science in Computer Science 
with one year of experience in DECNET- 
VAX programming, VAX system manage¬ 
ment, and VAX/PDP-11 assembly using 
FMS. Yearly Salary: $33,000. Apply at the 
Texas Employment Commission, Dallas, 
Texas, or send resume to the Texas Employ¬ 
ment Commission, TEC Building, Austin, 
Texas 78778, J.O. # 5424709. Ad Paid by
an Equal Employment Opportunity 
Employer. 


COMPUTER SCIENCE 
FACULTY POSITION 

The Computer Science Department in¬ 
vites applications for a tenure-track position 
at the Assistant Professor level for Spring, 
1991. Qualifications include a doctorate in 
Computer Science or related area. Candi¬ 
dates should be qualified in one or more of 
the following: systems programming, compilers,
programming languages, data base,
architecture, parallelism, artificial intelli¬ 
gence, or software engineering. The pro¬ 
gram offers bachelor's and master’s degrees, 
with over 250 undergraduate majors and 
minors and 30 master’s candidates. Com¬ 
puter facilities include: IBM 4381, VAX 
6320, PDP 11/44 (UNIX), TI Explorer, 
125+ microcomputers, expert system, 
CASE and CBE tools, UUCP and BITNET.
WKU is located in Bowling Green, KY, 2 
hours south of Louisville and 1 hour north of 
Nashville, TN, with a population of 56,000. 
Housing costs are less than 90% of the na¬ 
tional norm. Review of applications will 
begin September 30, 1990. Send letter of 
application, vita, and names of at least three 


references to: Office of Academic Affairs, 
Computer Science Search, Western Ken¬ 
tucky University, Bowling Green, KY 
42101. Women and minorities are en¬ 
couraged to apply. An Affirmative Action, 
Equal Opportunity Employer. 


UNIVERSITY OF BRITISH COLUMBIA 
Faculty Position 

The Department of Electrical Engineering, 
University of British Columbia invites appli¬ 
cations for a tenure-track appointment as 
Assistant or Associate Professor in Com¬ 
puter Engineering. Areas of interest include 
hardware architecture, parallel/concurrent 
systems, software engineering, fault-tolerant 
systems, real-time systems, and image based 
measurement systems. 

A Ph.D. is required. Industrial and/or 
teaching experience would be useful. The 
successful applicant would be expected to 
pursue research vigorously, and to teach at 
the graduate and undergraduate levels. Col¬ 
laboration with the Department of Computer 
Science is facilitated through the Centre for 
Integrated Computer Systems Research. 

Salary is commensurate with qualifica¬ 
tions and experience. Start-up funding is 
available, for purchase of equipment and 
support of graduate student research as¬ 
sistants. The position is available immediate¬ 
ly. Priority will be given to applications 
received on or before October 31, 1990. 

To apply, send curriculum vitae, reprints 
of published papers, and names of at least 
three references, and state eligibility for 
employment in Canada to: 

Dr. R.W. Donaldson, Head 
Department of Electrical Engineering 
The University of British Columbia 
2356 Main Mall 
Vancouver, B.C. 

Canada V6T 1W5 

The University of British Columbia is com¬ 
mitted to the Federal Government’s employ¬ 
ment equity program and encourages appli¬ 
cations from all qualified individuals. In 
accordance with Canadian Immigration re¬ 
quirements, priority will be given to Cana¬ 
dian citizens and permanent residents of 
Canada. 


ACQUISITIONS EDITOR 

Academic Press, an international pub¬ 
lisher of scientific and technical books and 
journals, seeks applicants for an engineering 
editorial position with emphasis on computer 
science and with a secondary background in 
physical science. This person will be respon¬ 
sible for contacting potential authors to 
acquire manuscripts for new books and jour¬ 
nals. Applicants must have publishing ex¬ 
perience in science or a technical area, or 
graduate education in engineering and com¬ 
puter science. The editorial position offers 
comprehensive benefits and an excellent 
starting salary. Please send detailed cover 
letter, resume and salary requirements to: 

Academic Press 
Personnel Department (AEB) 

1250 Sixth Avenue 
San Diego, CA 92101 
EOE/MFVH 
















CALL FOR PAPERS 

1991 ACM International Conference on 
Supercomputing 




June 17-21, Cologne, Germany 


Sponsored by ACM-SIGARCH in association with AICA, CSRD, CTI, CWI, GI, 
GMD, INRIA, IPSJ, KFA, SBMAC, and SIAM-SIAGS. 

Conference Co-Chairmen 

Edward S. Davidson, University of Michigan, Ann Arbor, USA 
Friedel Hossfeld, Research Center Juelich (KFA), Germany 

Program Director 

Yoichi Muraoka, Waseda University, Tokyo, Japan 


The fifth International Conference on Supercomputing is soliciting papers on significant 
new research results in the development and use of supercomputing systems. 
Contributions should emphasize the novel aspects of the work being reported and should 
discuss their implications for future supercomputer development. Papers are solicited in 
the following areas: 


Architectural Design of Supercomputer Systems. 
Software Systems Support for Supercomputing. 
Applications of Supercomputing. 
Supercomputing Algorithms and Performance Analysis. 


Submissions 

Conference Proceedings will be published by ACM. Authors should send five copies of 
the full manuscript to the program chairman of their region. The deadline for 
submissions is December 1, 1990. Authors will be notified of acceptance by February 20, 
1991. Final versions of accepted submissions will be due by March 20, 1991. The 
addresses for submissions are: 


Europe and Africa: 

Ulrich Trottenberg 
GMD/F1 

Schloss Birlinghoven 
D-5205 Sankt Augustin 1 
Germany 


North and South America: 

Elias Houstis 
Dept. Computer Science 
Purdue University 
W. Lafayette, IN 47907 
USA 


Japan and the Far East: 

Toshitugu Yuba 
ETL 1-1-4 Umesono 
Tukuba, Ibaraki 
305 Japan 


Inquiries can be directed to: Ruediger Esser, KFA-ZAM, D-5170 Juelich, Germany 
email: zdv003 at djukfa11.bitnet; phone: +49-2461-61-6588; fax: +49-2461-61-6656





CONFERENCES 


Editor: Edmund L. Gallizzi, Computer Science Dept, Eckerd College, St. Petersburg, FL 33733; (813) 864-8272; Compmail+ e.gallizzi 


Build defect-free software, Fagan urges 


Ware Myers, Contributing Editor 

“We know how to build defect-free 
software,” Michael E. Fagan told 500 
software developers and testers at the 
Seventh International Conference on 
Testing Computer Software June 19 in 
San Francisco. The key to this achieve¬ 
ment is “continuous incremental process 
improvement.” 

The conference, though focused on 
testing, had presentations covering the 
entire range of improvements to the soft¬ 
ware process — formal inspections, cor¬ 
rectness verification, human factors, and 
management aspects. 

The three-day meeting, managed by 
the US Professional Development Insti¬ 
tute, was sponsored by the Data Process¬ 
ing Management Association Education 
Foundation in cooperation with the 
American Society for Quality Control, 
the ACM, and the International Test and 
Evaluation Association. 

The program was organized by David 
Gelperin of the consulting firm Software 
Quality Engineering and Genevieve 
Houston-Ludlam of Frontier Technolo¬ 
gies. 

Fagan developed a formal-inspection 
methodology at IBM in the mid-1970s 
that was published in the IBM Systems 
Journal in 1976* and came to be known as 
Fagan inspections. Since then, the 
method has been used within IBM and by 
some other organizations. Fagan left 
IBM in January 1989 and is now convey¬ 
ing his methodology to the software 
world at large. 

The underlying problem is doing 
things wrong the first time through the 
software process. Designing and coding 
from incomplete requirements can halve 
productivity, Fagan said, by necessitat¬ 
ing multiple trips through the software 
stages. Moreover, changes in design and 
code caused by changes or additions to 
the requirements generally exhibit about
three times the defect rate of the base design.
He estimated that the effort devoted
to reworking defects reduces development
productivity by 30 to 60 percent.

* M.E. Fagan, “Design and Code Inspections to Reduce
Errors in Program Development,” IBM Systems
J., Vol. 15, No. 3, 1976, pp. 219-248.

Inspection process. The first step to¬ 
ward correcting these problems and ob¬ 
taining the benefits of improved quality 
and productivity is to formally inspect 
the product at every stage of the software 
process — requirements, high-level de¬ 
sign, low-level design, code, test plan, 
test cases, and user documentation. 

Fagan outlined a six-step inspection 
process consisting of 

(1) planning, 

(2) group education, 

(3) individual preparation to fulfill assigned
roles (moderator, author,
readers, and tester) in the inspection,

(4) the actual inspection — finding de¬ 
fects, 

(5) reworking these defects, and 

(6) verifying that the defects have 
been resolved. 

There is a “hard” relationship between 
quality and productivity. “If we improve 
quality during development, what shows 
up is an improvement in productivity as 
well,” he asserted. 

He cited a productivity increase of 25 
percent following the introduction of for¬ 
mal inspections on a small Aetna Life and 
Casualty project (actually a reduction of 
25 percent from person-days projected 
by the company’s estimating method to 
actual person-days recorded). At the 
same time, the inspections uncovered 82 
percent of the errors; unit test found 18 
percent; and no errors turned up on accep¬ 
tance test or during the first two years of 
operation. 
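The parenthetical above matters when comparing figures: a 25 percent cut in person-days for the same product is not the same number as a 25 percent rise in output per person-day. The following is a minimal sketch of that arithmetic, not anything presented in the talk. The function names and the defect counts (chosen only to reproduce the 82 and 18 percent shares) are illustrative assumptions.

```python
def productivity_gain(effort_reduction: float) -> float:
    """Relative gain in output per person-day for a given cut in effort.

    Delivering the same product with a fraction r less effort raises
    output per person-day by r / (1 - r).
    """
    if not 0.0 <= effort_reduction < 1.0:
        raise ValueError("effort_reduction must be in [0, 1)")
    return effort_reduction / (1.0 - effort_reduction)


def share_found(found_by_stage: dict) -> dict:
    """Convert defect counts per stage into fractions of the total found."""
    total = sum(found_by_stage.values())
    if total == 0:
        raise ValueError("no defects recorded")
    return {stage: count / total for stage, count in found_by_stage.items()}


# Hypothetical counts that reproduce the reported percentages.
aetna_counts = {"inspection": 82, "unit test": 18, "acceptance test": 0}

print(productivity_gain(0.25))    # gain implied by a 25 percent effort cut
print(share_found(aetna_counts))  # fraction of defects found at each stage
```

Under this reading, a 25 percent reduction in person-days corresponds to roughly a one-third gain in output per person-day.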

A small IBM project in the United 
Kingdom found 93 percent of the defects 
by inspection, with a 9-percent gain in 
productivity. The product had zero de¬ 
fects at acceptance testing. 

Inspections on two Unisys projects
found 68 and 70 percent of defects prior to
test with net savings of $483,300 and
$397,945, respectively.

More broadly, with inspections imple¬ 
mented to an initial goal level character¬ 
ized by brief, informal training of the in¬ 
spection-meeting moderator, Fagan esti¬ 
mated that 60 percent of defects are found 
before testing, accompanied by a 10 per¬ 
cent gain in productivity. Carried to the 
second goal level — more formal training 
— 90 percent of defects are found and 
productivity improves by 25 percent. 

Still, to gain these benefits manage¬ 
ment must accommodate a shift in the 
utilization of human resources. It must 
plan to employ more people at the front 
end of the software process. Fagan esti¬ 
mated that formal inspections (and the re¬ 
sulting immediate rework) move about 
15 percent of the net resources
(person-hours) up front. On the plus side, the
use of inspections reduces overall
person-hours by 10 to 40 percent and at the
same time shortens the tail end of the
schedule.

Continuous improvements. But for¬ 
mal inspections are only a first step to¬ 
ward the “defect-free process.” The next 
step is to improve the process itself by 
“continuous incremental process im¬ 
provement.” 

Indicators of process inadequacies, 
according to Fagan, include 

(1) lack of standards, or even descrip¬ 
tions, governing the process, 

(2) unfamiliarity of personnel with 
these descriptions or standards, 

(3) failure to involve development 
personnel in developing process 
descriptions and standards, 

(4) failure of entry criteria to match the 
exit criteria of the previous process 
stage, 

(5) absence of documented agree¬ 
ments with suppliers or customers, 

(6) lack of measurements, and 

(7) a change control method that fails 
to keep all concerned informed. 












To correct process inadequacies, 
Fagan would have software organiza¬ 
tions formally define each stage of their 
process. Entry into the stage would be re¬ 
stricted to products meeting specific cri¬ 
teria; exit from the stage would be gov¬ 
erned by exit criteria. 

Examples of exit criteria from the 
high-level design stage are 

(1) design satisfies product require¬ 
ments, as well as the objectives of high- 
level design itself, 

(2) control flow is carried down to the 
module name level, and 


(3) rework from the inspection follow¬ 
ing high-level design has been completed 
and verified. 
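The entry and exit criteria described above can be sketched as simple checklists attached to each stage. This is only an illustration under stated assumptions: the class and function names, the entry criteria, and the low-level design stage are hypothetical, and only the three high-level design exit criteria come from the talk.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stage:
    """One formally defined stage of a software process."""
    name: str
    entry_criteria: List[str]
    exit_criteria: List[str]


def unmet(criteria: List[str], facts: Dict[str, bool]) -> List[str]:
    """Return the criteria that the recorded facts do not satisfy."""
    return [c for c in criteria if not facts.get(c, False)]


def handoff_gaps(previous: Stage, following: Stage) -> List[str]:
    """Entry criteria of the next stage not covered by the previous exit."""
    return [c for c in following.entry_criteria
            if c not in previous.exit_criteria]


high_level_design = Stage(
    name="high-level design",
    entry_criteria=["requirements inspected"],  # assumed for illustration
    exit_criteria=[
        "design satisfies product requirements",
        "control flow carried down to module name level",
        "inspection rework completed and verified",
    ],
)

low_level_design = Stage(  # hypothetical following stage
    name="low-level design",
    entry_criteria=[
        "design satisfies product requirements",
        "inspection rework completed and verified",
    ],
    exit_criteria=[],
)

# Facts recorded about the product at the end of high-level design.
facts = {
    "design satisfies product requirements": True,
    "control flow carried down to module name level": True,
    "inspection rework completed and verified": False,
}

print(unmet(high_level_design.exit_criteria, facts))
print(handoff_gaps(high_level_design, low_level_design))
```

Here the product may not leave high-level design until the rework is verified, and `handoff_gaps` checks the fourth indicator in Fagan's list, namely that each stage's entry criteria match the exit criteria of the stage before it.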

These criteria are a form of measure¬ 
ment. Furthermore, when the formal in¬ 
spection methodology reveals inadequa¬ 
cies in the process, they would be cor¬ 
rected and the improvement would be 
substantiated by the measurement tech¬ 
nique. 

As process improvement proceeds to
higher and higher levels of capability,
Fagan has seen software produced that
was “defect-free,” in the sense that no
defects were found in the first period of
customer operation. This high quality was
accompanied by reductions in development
effort in the 40 to 50 percent range
and in the development time schedule.

“Introducing the ‘defect-free process’
into your organization,” Fagan
concluded, “requires (1) formal process
definition of key operations, (2) rigorous
execution of the formally defined process,
(3) Fagan inspections of requirements,
design, and code, and (4) participation in
continuous process improvement by
managers and developers — because
they want to!”


Keynoter challenges attendees at multiple-valued logic symposium 

Gerhard Dueck, St. Francis Xavier University, Nova Scotia 


C. Michael Allen of the University of
North Carolina at Charlotte issued his
audience a challenge when he delivered the
keynote address at the 20th International
Symposium on Multiple-Valued Logic
May 23 to 25.

Speaking on the opening day on
“Multi-Valued Logic: Wave of the Future or An
Historical Anachronism,” Allen pointed
to problems associated with binary
systems and then urged the attendees to find
multivalued solutions to the problems.

The event was the 20th in the ISMVL 
series, with George Epstein of the Uni¬ 
versity of North Carolina serving as 
chair. Held in Charlotte, North Carolina, 
the symposium was cosponsored by the 
IEEE Computer Society, the society’s 
MVL Technical Committee, the Univer¬ 
sity of North Carolina at Charlotte, the 
Association for Symbolic Logic, and the 
Microelectronics Center of North Caro¬ 
lina. 

Beyond binary. Michitaka Kameyama
of Tohoku University in Japan
presented the first invited address, entitled
“Toward the Age of Beyond-Binary 
Electronics and Systems.” Kameyama 
attributed the limitation in submicron 
VLSI to the interconnection problem. 
Devices are extremely fast, but intercon¬ 
nections may cause delays almost an or¬ 
der of magnitude higher. 

Most of the chip area is dedicated to the 
transmission of signals. The number of 
interconnections can be reduced with 
multiple-valued signals. This will reduce 
the chip area as well as the delays. In addi¬ 
tion, a chip using multiple-valued signals 
would have lower power dissipation and 
a reduction in crosstalk noise. 

According to Kameyama, hardware
algorithms based on multiple-valued
data representation must be developed to
demonstrate the superiority of
multiple-valued logic over binary logic systems.
Some examples of where this has been
achieved are signed-digit arithmetic
circuits, residue-arithmetic circuits, and
image-processing chips. In signed-digit
arithmetic, carry propagation is limited
to one position — ideal for parallel
operation.

Multiple-valued bidirectional
current-mode circuits are well suited for
implementing such arithmetic. Japanese researchers
have fabricated a 32-by-32-bit multiplier
chip, which internally uses radix-4,
signed-digit number representation.
Chip area and power dissipation are half
that of the corresponding binary CMOS
multiplier. The speed is comparable to
the fastest binary multiplier.
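The claim that carry propagation stops after one position can be illustrated with a short sketch of radix-4 signed-digit addition. This is a textbook-style illustration, not the design of the chip described in the address. It assumes the digit set -3 to 3 and little-endian digit lists, and the function names are hypothetical.

```python
RADIX = 4
MAX_DIGIT = 3  # assumed digit set: -3 .. 3


def sd_value(digits):
    """Integer value of a little-endian signed-digit number."""
    return sum(d * RADIX ** i for i, d in enumerate(digits))


def sd_add(x, y):
    """Add two little-endian signed-digit numbers.

    Each position produces an interim digit w and a transfer t from its
    own two input digits only. The final digit is w plus the transfer
    from the position to its right, so no carry travels further than
    one position and all positions can be computed in parallel.
    """
    n = max(len(x), len(y))
    x = list(x) + [0] * (n - len(x))
    y = list(y) + [0] * (n - len(y))

    interim = []
    transfer = [0]  # nothing is transferred into position 0
    for xi, yi in zip(x, y):
        s = xi + yi                 # lies in -6 .. 6
        if s >= MAX_DIGIT:
            t, w = 1, s - RADIX     # w lies in -1 .. 2
        elif s <= -MAX_DIGIT:
            t, w = -1, s + RADIX    # w lies in -2 .. 1
        else:
            t, w = 0, s             # w lies in -2 .. 2
        interim.append(w)
        transfer.append(t)
    interim.append(0)               # room for the final transfer

    # |w| <= 2 and |t| <= 1, so every result digit stays in -3 .. 3.
    return [w + t for w, t in zip(interim, transfer)]


a = [3, 3, 3]   # 3 + 3*4 + 3*16 = 63
b = [3, 3, 3]
total = sd_add(a, b)
print(total, sd_value(total))
```

Because each interim digit is kept small enough to absorb an incoming transfer of -1, 0, or 1, the second step never generates a further carry, which is the property the address describes as ideal for parallel operation.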

Kameyama described bio-devices as a 
fertile area of research for MVL imple¬ 
mentations. Biomolecular computing 
may provide interconnection-free logic 
operations based on parallel distribution 
of logic information represented by va¬ 
rieties of molecules and parallel selec¬ 
tion using specificity of enzymes. En¬ 
zymes are highly specific in the choice of 
reactants, called substrates. The source 
broadcasts substrates in solution and the 
destination detects their presence using 
enzyme sensors. Parallel logic systems 
can be realized without specifying the to¬ 
pology of the network. 

“The basic research on MVL-oriented
devices such as bio-devices is essential to
realize beyond-binary electronics and
systems,” Kameyama said. “Such
challenges for the 1990s will be to establish a
significant role for MVL approaches in
practical applications,” he concluded.

Invited speakers. Melvin Fitting of 
the City University of New York deliv¬ 
ered the second invited address, “Bilat¬ 
tices in Logic Programming.” Fitting pre¬ 
sented background from logic program¬ 
ming and bilattices. He showed that bilat¬ 
tices are both natural and useful tools for 
logic programming and are well worth 
pursuing for this reason. 

R.S. Michalski of George Mason Uni¬ 
versity gave the final invited address, 
“Theory of Plausible Reasoning — Foun¬ 
dations and Methodology.” 

An outstanding paper award for 
ISMVL 89 was presented to T. Aoki, M. 
Kameyama, and T. Higuchi for “Design 
of a Highly Parallel Set Logic Network 
Based on a Bio-Device Model.” 

The proceedings of ISMVL 90 include 
the 63 contributed papers, plus the first 
two invited addresses. As in the past, the 
contributed papers encompass a broad 
spectrum of multiple-valued research ar¬ 
eas: circuits, algebra, spectral tech¬ 
niques, logic, fuzzy logic, logic design, 
artificial intelligence, expert systems, 
and special applications. 

Copies of the proceedings, order No. 
2046, are available from the IEEE Com¬ 
puter Society Press, Los Alamitos, Cali¬ 
fornia, by calling (800) CS-BOOKS or 
(714) 821-8380 in California. 

The University of Victoria in British 
Columbia, Canada, will host the 1991 
symposium. For further information, 
contact the symposium chair, D.M. 
Miller, Department of Computer Sci¬ 
ence, University of Victoria, PO Box 
3055, Victoria, B.C. V8W 3P6, Canada. 

ISMVL 92 will be held in Japan. 







Qualitative simulation and automatic 
model-building is focus of keynoter 

Hrishikesh P. Gadagkar and Mukul V. Shirvaikar, University of Tennessee 


Qualitative simulation and automatic 
model-building are two challenging 
problems that can be successfully solved 
using qualitative reasoning techniques. 
That is the strong belief of Benjamin
Kuipers from the University of Texas at
Austin, the first of three keynote speakers
at the Applications of Artificial Intelli¬ 
gence VIII Conference held in Orlando, 
Florida, April 17 to 19. 

Bernard Widrow of Stanford Univer¬ 
sity and Elliot Soloway of the University 
of Michigan also delivered keynote talks 
at the event, sponsored by the Interna¬ 
tional Society for Optical Engineering in 
cooperation with the IEEE Computer So¬ 
ciety and the IEEE Systems, Man, and 
Cybernetics Society. Mohan Trivedi of 
the University of Tennessee, Knoxville, 
served as chair of the program/confer¬ 
ence committee. 

Kuipers spoke on “The Use of Qualita¬ 
tive Simulation in Support of Model- 
Based Reasoning.” He said that auto¬ 
matic model-building is difficult to for¬ 
malize and that relatively little research 
has been reported in the field. 

Kuipers stressed how qualitative mod¬ 
eling and simulation methods can be used 
for model-based systems that utilize in¬ 
complete quantitative knowledge. Fur¬ 
ther, he added that qualitative differen¬ 
tial equations were successfully used in 
eliminating spurious predictions during 
simulation. However, realistic complex 
systems will need to be broken down into 
simpler mechanisms before it will be pos¬ 
sible to qualitatively simulate them. 

Systems pioneer. Widrow, who pio¬ 
neered advances in adaptive signal pro¬ 
cessing systems three decades ago, gave a 
talk entitled “The Truck Backer-Upper: 
An Example of Self-Learning in Neural 
Networks.” In it, Widrow gave an inter¬ 
esting and informative overview of the 
key ideas underlying neural network ap¬ 
plications, highlighting the diverse ap¬ 
plication areas of this burgeoning field. 

Widrow graphically demonstrated the 
use of neural networks to design highly 
nonlinear controllers using test cases 
such as the truck backer-upper. The de¬ 
sign of such controllers is not possible us¬ 
ing standard control theory techniques. 

Self-learning techniques were utilized 
to determine the internal parameters of 
the nonlinear controller. The present 
technique is, however, limited to training 
problems involving only a small number 
of degrees of freedom. 

Soloway presented an enlightening
talk on how learning skills of young
students can be enhanced by integrating
computers as a key feature in their
educational environment.

In his talk, entitled “Highly Interactive 
Computing Environments: The Next 
Wave,” Soloway explained how early in¬ 
troduction of computers in student cur¬ 
ricula provides a powerful tool for tap¬ 
ping the creativity in young students and 
allows them to understand how complex 
tasks can be programmed by breaking 
them down into smaller modules. 

Highlighting the rapid progress in
computer technology, Soloway argued
that the environment in which young
students are functioning today is completely
different from the environment in which
most of today’s educators were trained.
Therefore, he said, this poses a challenge
to us in finding the appropriate models
and mechanisms for the best educational
experience we can provide young
students.

According to Soloway, the present
educational system falls short in
conveying the purpose or eventual utility of
many skills taught. By narrating some
of his personal interactions with young
students while using his technique of
teaching them the concepts of computer
programming, Soloway explained how
he was able to overcome some of the
shortcomings of the present educational
system.

At times humorous, Soloway’s talk 
was quite thought provoking and had a se¬ 
rious undertone. 

Technical program. The technical 
papers presented during the conference 
covered AI application domains such as 
two- and three-dimensional computer vi¬ 
sion, expert systems, neural networks, 
diagnostic systems, aerospace applica¬ 
tions, semiconductor manufacturing sys¬ 
tems, and various aspects of robotics. 

The conference allowed researchers 
from academia and various research 
laboratories to familiarize themselves 
with the research interests of technical 
speakers from private industry. The
unique feature of this conference is its
emphasis on the practical and engineering
aspects associated with the use of AI
techniques.

The conference proceedings are avail¬ 
able through SPIE, Bellingham, Wash¬ 
ington (Vol. 1293). The next event in the 
series will be held in Orlando, Florida, in 
April 1991. Further information can be 
obtained by calling (206) 676-3290. 


IEEE Micro seeks manuscripts for general-interest
issues in 1991. Topics of
particular interest range from artificial intelli¬ 
gence and biological computing to VHDL de¬ 
sign and workstations. Submit manuscript to 
Joe Hootman, EE Dept., Univ. of North Da¬ 
kota, PO Box 7165, Grand Forks, ND 58202, 
phone (701) 777-4331. 

1991 IEEE Computer Society VLSI 
Workshop: Feb. 1991, Orlando, Fla. 
Sponsor: IEEE Computer Society Technical 
Committee on VLSI. Submit paper to Len Ber¬ 
man, IBM T.J. Watson Research Center, PO 
Box 218, Yorktown Heights, NY 10598, phone 
(914) 945-1213, fax (914) 945-2141, e-mail 
berman@ibm.com. 

Iecon 91, 17th Conf. of the IEEE Industrial
Electronics Society: Oct. 28-Nov. 1, 1991, 
Kobe, Japan. Submit paper to Hiro Haneda, 
Electronics Engineering Dept., Kobe Univ., 
Rokko-dai, Nada-ku, Kobe City, Hyogo 857, 
Japan, phone 81 (78) 881-1212, fax 81 (78) 
861-7879. 

Int’l J. Computer-Aided VLSI Design plans a 
special issue on VLSI/systolic arrays. Pub¬ 
lisher: Ablex. Submit five copies of full papers 
by Aug. 30, 1990, to Bijan Karimi, Electrical 
and Computer Engineering Dept., Univ. of 
New Haven, West Haven, CT 06516, phone 
(203) 932-7164. 

ETC 91, 1991 European Test Conf.:
Apr. 17-19, 1991, Munich, West Germany.
Sponsor: VDE (Zentralstelle Tagungen
und Seminare). Submit four copies of abstract
or full paper by Aug. 31, 1990, to ETC 91, c/o
Bennetts Associates, Burridge Farm,
Burridge, Southampton SO3 7BY, UK, fax (44)
489-579519.

CAIA 91, Seventh IEEE Conf. on Artificial
Intelligence Applications: Feb.
24-28, 1991, Miami Beach, Fla. Submit paper
by Aug. 31, 1990, to Tim Finin, Center for
Advanced Information Technology, Unisys, 70E
Swedesford Rd., PO Box 517, Paoli, PA 19301,
phone (215) 648-2840, fax (215) 648-2288,
e-mail finin@prc.unisys.com.

IEEE Trans. Reliability plans a special issue
on design for reliability of telecommunication
systems and services. Submit author letter of
commitment (including brief paper
description) by Sept. 1, 1990, and six copies of the
manuscript by Nov. 15, 1990, to Andrew
Reibman, AT&T Bell Labs, Rm. 2L-518, Holmdel,
NJ 07733, phone (201) 949-1930, fax (201)
949-7724, e-mail alr@hoqaa.att.com; or C.S.
Raghavendra, EE-Systems Dept., SAL 300,
Univ. of Southern California, Los Angeles, CA
90089, phone (213) 743-5532, fax (213)
745-7284, e-mail raghu@surya.usc.edu.

CALL FOR PAPERS

Int’l J. of Computer Simulation plans a
special issue in 1991 on distributed file system and
database simulation. Publisher: Ablex. Submit
four copies of complete manuscript by Sept. 1,
1990, to Darrell D.E. Long, Computer and
Information Sciences, Univ. of California, Santa
Cruz, CA 95064, phone (408) 429-2616,
e-mail darrell@cis.ucsc.edu on Internet.


Systems/USA Technology Conf.: Feb. 11-13, 
1991, Anaheim, Calif. Sponsor: American 
Electronics Assoc. Submit abstract by Sept. 1, 
1990, to Roy Webster, AEA, 5201 Great Amer¬ 
ica Pkwy., Santa Clara, CA 95054, phone (503) 
359-5873, fax (503) 357-3839. 


Call for papers and referees for Computer 


seeks articles for in- 
1991 special issues. 


Computer Generated Music has
been selected as the theme for the
July 1991 edition. The issue will be
devoted to examining the driving forces in
the field from a computational
standpoint, assessing the limits of computer
music in the general music field, and
discussing future desirable directions.
See the April 1990 issue of Computer
(p. 127) for complete information.

Abstracts are due by August 30,
1990, and four copies of the full
manuscript and four audio cassettes are due
by October 30, 1990. Notification of
acceptance is set no later than
December 31, 1990, and the final version of
the manuscript is due no later than
March 30, 1991.

Submissions should be sent to
Denis Baggi, Istituto Dalle Molle per
Studi sull’ Intelligenza Artificiale,
Corso Elvezia 36, 6900 Lugano,
Switzerland, phone 41 (91) 56 15 78,
Europe e-mail denis%idsia.uucp@chx400.switch.ch,
US e-mail baggi@berkeley.edu.

Real-Time Systems will be the 
theme of the May 1991 edition. Tuto¬ 
rial, survey, case-study, or pedagogic 
manuscripts are sought. See the July 
1990 issue of Computer (p. 120) for 
complete information. 

Eight copies of the full manuscript
are due by September 1, 1990.
Notification of decisions is set no later than
December 1, 1990, and the final
version of the manuscript is due no later
than February 1, 1991.

Submissions and questions should
be directed to either of the guest
editors, Yann-Hang Lee, Computer and
Information Science, University of
Florida, Gainesville, FL 32611, phone
(904) 392-1536, e-mail yhlee@cis.ufl.edu;
or C.M. Krishna, Dept. of Electrical
and Computer Engineering,
University of Massachusetts, Amherst, MA
01003, phone (413) 545-0766, e-mail
krishna@ecs.umass.edu.


For submittal to Computer, 
manuscripts must not have been 
previously published or currently 
submitted for publication else¬ 
where. Each manuscript should be 
no more than 32 typewritten, 
double-spaced pages long, includ¬ 
ing all text, figures, and references. 
Each submittal should include a 
cover page that contains the title of 
the article, the full name(s) and 
affiliation(s) of the author(s), com¬ 
plete postal and electronic 
address(es) of all the authors as 
well as their telephone and fax 
number(s), a 300-word abstract, 
and a list of keywords identifying 
the central issues of the manu¬ 
script’s contents. The final manu¬ 
script should be approximately 
8,000 words in length and contain 
no more than 12 references. 

If you are willing to review articles 
for these special issues, please 
send a note listing your research in¬ 
terests to Bruce Shriver, editor-in- 
chief of Computer or to one of the 
guest editors listed for the particu¬ 
lar issue. Shriver may be reached at 
the University of Southwestern 
Louisiana, PO Drawer 42730, 
Lafayette, LA 70504-2730, phone 
(318) 231-5811, fax (318) 265- 
5472, e-mail b.shriver on 
Compmail+ or shriver@usl.edu on 
Internet. 


Distributed Computing Systems
has been selected as the theme for the
August 1991 issue. Prospective
authors are invited to submit tutorial,
survey, descriptive, case-study,
application-oriented, or pedagogic
manuscripts. See the July 1990 issue of
Computer (p. 120) for complete
information.

Abstracts are due by November 15,
1990, and the deadline for full
manuscripts is January 1, 1991. Notification
of decisions is set no later than March
15, 1991, and the final version of the
manuscript is due no later than May 1,
1991.

Submittals and questions should be
directed to either of the guest editors,
Mukesh Singhal, Dept. of Computer
and Information Science, Ohio State
University, Columbus, OH 43210,
phone (614) 292-5839, e-mail
singhal@cis.ohio-state.edu; or
Thomas L. Casavant, Dept. of Electrical
and Computer Engineering, University
of Iowa, Iowa City, IA 52242, phone
(319) 335-5953, e-mail
tomc@eng.uiowa.edu.

Electrosoft plans a special issue on software
for system transient modeling. Publisher:
Computational Mechanics. Submit paper by
Sept. 2, 1990, to H.W. Dommel, EE Dept.,
Univ. of British Columbia, 2356 Main Mall,
Vancouver, B.C., Canada V6T 1W5.

Heterogeneous Distributed Data¬ 
base Systems is the theme planned for 
the December 1991 issue. Although 
not limited to the following, potential 
topics of interest include 

• Autonomy in multidatabase envi¬ 
ronments 

• Interdatabase dependencies 

• Transaction management in
heterogeneous environments

• Tools for building federated sys¬ 
tems 

• Use of new paradigms (such as the 
object-oriented approach) 

• Conceptual modeling in heteroge¬ 
neous multimedia systems 

• Semantic query processing in multi¬ 
database systems 

Abstracts of the manuscripts are due
no later than January 1, 1991, and
eight copies of the full manuscripts
must be submitted by April 1, 1991.
Notification of decisions is July 1, 1991,
and the final version of each
manuscript is due September 1, 1991.

Submissions and questions should
be directed to Sudha Ram, Department
of Management Information Systems,
Eller School of Management,
University of Arizona, Tucson, AZ 85721,
phone (602) 621-2748, e-mail
ram@mis.arizona.edu on Internet or
ram@arizmis on Bitnet.















EDAC 91, European Design Automation
Conf.: Feb. 25-28, 1991, Amsterdam.
Cosponsor: Institution of Electrical
Engineers. Submit paper by Sept. 3, 1990, to
Secretariat, EDAC 91, CEP Consultants, 26-28
Albany St., Edinburgh EH1 3QH, Scotland,
phone 44 (31) 557-2478, fax 44 (31) 557-5749.

CG Int’l 91: June 22-28, 1991, Cambridge,
Mass. Cosponsors: Computer Graphics
Society, MIT. Submit six copies of summary by
Sept. 4, 1990, and six copies of full paper by
Nov. 5, 1990, to N.M. Patrikalakis, MIT Rm.
5-428, 77 Massachusetts Ave., Cambridge, MA
02139, phone (617) 253-4555, fax (617)
253-8125, e-mail nmp@deslab.mit.edu.

Auto Carto 10, 10th Int’l Symp. on
Automated Cartography: Mar. 25-28, 1991.
Cosponsors: American Cartographic Assoc. et al.
Submit five copies of full draft paper by Sept.
7, 1990, to Auto Carto 10, Geography Dept.,
105 Wilkeson, North Campus, State Univ. of
New York at Buffalo, Amherst, NY 14260,
phone (716) 636-2545, fax (716) 636-2329.

ICSE 13, 13th Int’l Conf. on Software
Engineering: May 13-16, 1991, Austin,
Texas. Cosponsor: ACM. Submit eight copies
of paper by Sept. 14, 1990, to David Barstow,
Schlumberger Lab for Computer Science, PO
Box 200015, Austin, TX 78720-0015.

Second Int’l Conf. on Systems
Integration: Apr. 22-25, 1991, Morristown, N.J.
Cosponsors: New Jersey Inst. of Technology et
al. Submit six copies of paper and abstract by
Sept. 14, 1990, to Raymond T. Yeh, c/o Peter
A. Ng, CIS Dept., New Jersey Inst. of
Technology, University Heights, Newark, NJ 07102,
phone (201) 596-3387, e-mail
ng_p@vienna.njit.edu.

Ada-Europe Athens 91 Conf.: May 13-17,
1991, Athens. Cosponsors: Ada-Europe et al.
Submit extended abstract by Sept. 14, 1990, to
Dimitris Christodoulakis, Univ. of Patras,
Computer Engineering Dept. and Computer
Technology Inst., GR - 26500 Patras, phone 30
(61) 991-650, fax 30 (61) 991-909, e-mail
dxri@grpatvx1.bitnet.

IMS 91, First IEEE Int’l Workshop on
Interoperability in Multidatabase
Systems: Apr. 8-9, 1991, Kyoto, Japan.
Submit seven copies of extended abstract by Sept.
15, 1990, to Marek Rusinkiewicz, Univ. of
Houston, Computer Science Dept., Houston,
TX 77204-3475, phone (713) 749-4791,
e-mail marek@cs.uh.edu; or Yahiko
Kambayashi, Kyushu Univ., Computer Science and
Computer Engineering Dept., Hakozaki,
Fukuoka 812, Japan, fax 81 (92) 641-1825, e-mail
yahiko@csce.kyushu-u.ac.jp.

CCW 91, Third IEEE Conf. on Computer
Workstations: May 15-17, 1991,
Cape Cod, Mass. Sponsor: IEEE Computer So¬ 
ciety Technical Committee on Operating Sys¬ 
tems. Submit five copies of paper by Sept. 15, 
1990, to Keith Marzullo, Computer Science 
Dept., Upson Hall, Cornell Univ., Ithaca, NY 
14853. 

Dasfaa 91, Second Int’l Symp. on
Database Systems for Advanced
Applications: Apr. 2-4, 1991, Tokyo. Sponsor:
Information Processing Society of Japan.
Submit three copies of full paper by Sept. 15, 1990,
to Akifumi Makinouchi, Computer Science
and Communication Engineering Dept.,
Kyushu Univ., Hakozaki 6-10-1, Fukuoka 812,
Japan, phone 81 (92) 641-1101, ext. 6055, fax 81
(92) 641-1101, ext. 5418, e-mail
akifumi@vax88.csce.kyushu-u.ac.jp.

SESAW, Fourth Software Engineering
Standard Application Workshop:
May 21-23, 1991, San Diego, Calif. Submit
1,000-word abstract by Sept. 15, 1990, and
four copies of manuscript by Jan. 1, 1991, to
David Card, Computer Sciences Corp., 4061
Powder Mill Rd., Calverton, MD 20705.


ISSCC 91, 1991 IEEE Int’l Solid-State
Circuits Conf.: Feb. 13-15, 1991, San Francisco,
Calif. Sponsors: IEEE Solid-State Circuits
Council et al. Submit 30 copies of abstract and
summary by Sept. 26, 1990, to John H.
Wuorinen, 2 School St., PO Box 304, Castine, ME
04421, phone (207) 326-8811.

Int’l J. of Computer Simulation plans a spe¬ 
cial issue in 1991 on distributed simulation. 
Publisher: Ablex. Submit five copies of com¬ 
plete manuscript by Sept. 31, 1990, to Bojan 
Groselj, Center for Advanced Computer Stud¬ 
ies, Univ. of Southwestern Louisiana, PO Box 
44330, Lafayette, LA 70504, phone (318) 231- 
6606, fax (318) 231-5791, e-mail bojan@cacs. 


CHDL 91, 10th Int’l Symp. on Computer
Hardware Description Languages
and their Applications: Apr. 22-24,
1991, Marseille, France. Cosponsors: Int’l
Federation for Information Processing et al.
Submit five copies of full paper by Sept. 15,
1990, to Ronald Waxman, EE Department,
Thornton Hall, Univ. of Virginia,
Charlottesville, VA 22903-2442, phone (804)
924-6086, fax (804) 924-8818, e-mail
ronw@virginia.edu.

RTA 91, Fourth Int’l Conf. on Rewriting
Techniques and Applications: Apr. 10-12,
1991, Como, Italy. Sponsors: State Univ. of
Milan. Submit 10 copies of extended abstract
or full paper by Sept. 15, 1990, to Ronald V.
Book, Theoretische Informatik, Inst. für
Informatik, Univ. Würzburg, Am Hubland, D-8700
Würzburg, West Germany, US phone (805)
961-2778, e-mail book%henri@hub.ucsb.edu.

Fifth Int’l Parallel Processing Symp.: Mar.
27-29, 1991, Newport Beach, Calif. Submit
four copies of complete paper or 1,000-word
summary by Sept. 15, 1990, to V.K. Prasanna
Kumar, Electrical Engineering-Systems
Dept., SAL 344, Univ. of Southern California,
Los Angeles, CA 90089-0781, phone (213)
743-5236, fax (213) 745-7284, e-mail
ipps@ashoka.usc.edu.

IEEE J. Solid-State Circuits plans a series of 
special issues on microelectronics systems. 
Submit five copies of complete paper by Sept. 
15, 1990, to Donald W. Bouldin, Electrical and 
Computer Engineering, Univ. of Tennessee, 
Knoxville, TN 37996-2100, phone (615)
974-5444, fax (615) 974-5492, e-mail
bouldin@sunl.engr.utk.edu.

24th Computer Simulation Conf.: Apr. 1-5,
1991, New Orleans. Sponsor: Society for
Computer Simulation. Submit abstract by Sept. 15,
1990, and full paper by Dec. 1, 1990, to George
W. Zobrist, Computer Science Dept., Univ. of
Missouri at Rolla, Rolla, MO 65401, phone
(314) 341-4836, e-mail c2816@umrvmb.umr.edu.

1991 IEEE Int’l Conf. on Robotics and
Automation: Apr. 7-12, 1991, Sacramento, Calif.
Sponsor: IEEE Robotics and Automation
Society. Submit four copies of paper by Sept. 16,
1990, to T.J. Tarn, Systems Science and
Mathematics, Campus Box 1040, Washington Univ.,
St. Louis, MO 63130.


Fourth Int’l Conf. on Industrial and
Engineering Applications of Artificial
Intelligence and Expert Systems: June 2-5,
1991, Kauai, Hawaii. Sponsors: ACM et al.
Submit four copies of extended abstract by
Oct. 1, 1990, to Jim Bezdek, Computer Science
Div., Univ. of West Florida, Pensacola, FL
32514, phone (904) 474-2784, fax (904)
474-2096, e-mail jbezdek@uwf.bitnet.

CHI 91, 1991 Conf. on Human Factors
in Computing Systems: Apr. 28-May 2,
1991, New Orleans. Sponsor: ACM. Submit
six copies of abstract/paper by Oct. 1, 1990, to
Peter Polson, Psychology Dept., Univ. of
Colorado, Muenzinger Hall, Campus Box 345,
Boulder, CO 80309-0345, phone (303)
492-5622, e-mail ppolson@clipr.colorado.edu.


IEEE Trans. on Parallel and Distributed
Systems plans a special issue in July 1991 on
parallel languages and compilers. Submit six copies
of paper by Oct. 1, 1990, to David Padua,
Center for Supercomputing Research and
Development, Univ. of Illinois, Urbana, IL 61801,
phone (217) 333-4223, e-mail
padua@uicsrd.csrd.uiuc.edu; Benjamin Wah, Coordinated
Science Lab., Univ. of Illinois, 1101 W.
Springfield Ave., Urbana, IL 61801, phone
(217) 333-3516, e-mail
wah%acquinas@uxc.cso.uiuc.edu; or Pen-Chung Yew, Center for
Supercomputing Research and Development,
Univ. of Illinois, Urbana, IL 61801, phone
(217) 244-0045, e-mail
yew@uicsrd.csrd.uiuc.edu.


1991 IEEE Int’l Symp. on Information
Theory: June 23-29, 1991, Budapest, Hungary.
Submit short paper by Oct. 1, 1990, and long
paper by Nov. 1, 1990, to Anthony
Ephremides, Electrical Engineering Dept., Univ. of
Maryland, College Park, MD 20742, phone
(301) 454-6871, e-mail tony@eng.umd.edu.

ISCAS 91, 24th IEEE Int’l Symp. on Cir¬ 
cuits and Systems: June 11-14, 1991, Singa¬ 
pore. Sponsor: IEEE Circuits and Systems So¬ 
ciety. Submit six copies of summary by Oct. 1, 
1990, to Technical Program Chair, ISCAS 91 
Secretariat, Communication Int’l Associates, 
44/46 Tanjong Pagar Rd., Singapore 0208, 
phone (65) 226-2823, fax (65) 226-2877. 

First Int’l Workshop on Performability
Modeling of Computer and Communication
Systems: Feb. 18-19, 1991, Enschede, The
Netherlands. Submit three copies of abstract or
paper by Oct. 1, 1990, to Nico M. van Dijk,
Free Univ., Faculty of Economics, PO Box
7161, 1007 MC Amsterdam, The Netherlands.


Fifth Int’l Workshop on High-Level
Synthesis: Mar. 3-6, 1991, Buhlerhohe, West
Germany. Cosponsors: IEEE et al. Submit 12
copies of extended summary by Oct. 8, 1990, to
Wolfgang Rosenstiel, Forschungszentrum
Informatik an der Univ. Karlsruhe,
Haid-und-Neu Strasse 10-14, D-7500 Karlsruhe, FRG.

Ninth IEEE VLSI Test Symp.: Apr.
16-18, 1991, Atlantic City, N.J. Cosponsor:
IEEE Philadelphia Section. Submit abstract
(50 words) and summary (200-300 words) by
Oct. 19, 1990, to Kedong Chao, Johns Hopkins
Univ., Applied Physics Lab, Johns Hopkins
Road Bldg. 23-295, Laurel, MD 20723, phone
(301) 953-6121, fax (301) 953-1093.

ICDCS 91, 11th Int’l Conf. on Distributed
Computing Systems: May 20-24,
1991, Arlington, Texas. Submit five copies of
abstract and paper by Oct. 23, 1990, to
Benjamin W. Wah, ICDCS 91, Coordinated
Science Lab, MC228, Univ. of Illinois, 1101 W.
Springfield Ave., Urbana, IL 61801-3082,
phone (217) 333-3516, fax (217) 244-1764,
e-mail wah%aquinas@uxc.cso.uiuc.edu.


CALENDAR 


® In the accompanying Calendar, the IEEE Computer Society logo identifies 
the conferences the society is sponsoring or participating in. Other confer¬ 
ences of interest to our readers, as well as their sponsors, are also listed. 

For inclusion in Call for Papers or Calendar, submit the following information: 
event name, date(s), location, and sponsor(s) as well as the phone and fax num¬ 
bers and the electronic address of the person to contact. In addition, for Calls for 
Papers listings, include the name of the person to whom papers should be submit¬ 
ted and the deadline for submittals. 

Computer should receive the above-mentioned information at least five weeks 
before the month of publication (i.e., for the October 1990 issue, send information 
for receipt by August 20, 1990) to Chuck Governale, Calendar Dept., Computer, 
PO Box 3014, Los Alamitos, CA 90720-1264. 


August 1990 


gerberg, CH-8093, Zurich, Switzerland, 
phone 41 (1) 377-3051. 


® CVPR 91, IEEE Computer Society 
Conf. on Computer Vision and Pattern 
Recognition: June 3-7, 1991, Lahaina, Maui, 
Hawaii. Submit four copies of complete paper 
by Nov. 12, 1990, to Gerard Medioni, Inst. for 
Robotics and Intelligent Systems, PHE 204, 
MC 0273, Univ. of Southern California, Los 
Angeles, CA 90089-0273, e-mail 
medioni@dworkin.usc.edu. 

® SCM 3, Third Int’l Workshop on Software 
Configuration Management: 
June 12-14, 1991, Trondheim, Norway. Cosponsors: 
ACM et al. Submit four copies of position 
paper and full paper by Nov. 15, 1990, to 
Peter Feiler, Software Engineering Inst., Carnegie 
Mellon Univ., Pittsburgh, PA 15213-3890, 
phone (412) 268-7790, e-mail phf@sei. 


Symp. on Experiences with Distrib- 
uted and Multiprocessor Systems: 
Mar. 21-22, 1991, Atlanta. Sponsor: Usenix 
Assoc. Submit 10 copies of full paper by Nov. 
19, 1990, to Gene Spafford, Software Engi¬ 
neering Research Center, Computer Sciences 
Dept., Purdue Univ., West Lafayette, IN 
47907-2004, phone (317) 494-7825, e-mail 
spaf@cs.purdue.edu. 

® ISCA 18, 18th Int’l Symp. on Computer 
Architecture: May 27-30, 1991, 
Toronto, Canada. Cosponsor: ACM. Submit 
five copies of manuscript by Nov. 21, 1990, to 
John Hayes, Electrical Engineering and Computer 
Science Dept., Univ. of Michigan, 1301 
Beal Ave., Ann Arbor, MI 48109, phone (313) 
763-0386. 

Sixth Int’l Workshop on Software 
Specification and Design: Oct. 25-26, 
1991, Como, Italy. Submit five copies of regular 
or position paper by Jan. 21, 1991, to Carlo 
Ghezzi, Dip. di Elettronica Politecnico di Milano, 
Piazza Leonardo Da Vinci 32, 20133 Milano, 
Italia, e-mail relett24@imipoli.bitnet. 


UPADI 90, 21st Convention of the Pan 
American Federation of Engineering Societies, 
Aug. 19-24, Washington, DC. Cosponsors: 
American Assoc. of Engineering Societies, 
American Society of Civil Engineers. 
Contact UPADI 90, ASCE, 345 E. 47th St., 
New York, NY 10017, phone (212) 705-7218. 

® Hot Chips II, Symp. on High-Performance 
Chips, Aug. 20-21, Santa Clara, 
Calif. Sponsor: IEEE Computer Society Tech¬ 
nical Committee on Microprocessors and Mi¬ 
crocomputers. Contact Hasan S. Alkhatib, 
EECS Dept., Santa Clara Univ., Santa Clara, 
CA 95053, phone (408) 554-4485, fax (408) 
554-5474, e-mail halkhatib@scu.edu. 

Second Int’l Joint Conf. of ISSAC 90 (1990 
Int’l Symp. on Symbolic and Algebraic 
Computation) and AAECC 8 (Eighth Int’l 
Conf. on Applied Algebra, Algebraic Algo¬ 
rithms, and Error-Correcting Codes), Aug. 
20-24, Tokyo. Cosponsors: ACM et al. Contact 
Conf. Secretariat, IJC-2, Scientist, Inc., Yama- 
zaki Bldg., 3-2 Kanda Suruga-dai, Chiyoda- 
ku, Tokyo 101, Japan. 

Coling 90, 13th Int’l Conf. on Computational 
Linguistics, Aug. 20-25, Helsinki, Finland. 
Contact Hans Karlgren, KVAL, Skeppsbron 
26, S-111 30 Stockholm, Sweden, phone 
46 (8) 789-6683. 


September 1990 


® ISPRS Commission V Symp., Close-Range 
Photogrammetry Meets Machine Vision, 
Sept. 3-7, Zurich. Cosponsors: 
Int’l Society for Photogrammetry and Remote 
Sensing et al. Contact Armin Gruen, Inst. of 
Geodesy and Photogrammetry, ETH-Hoeng- 


EuroVHDL 90, First European Working 
Conf. on VHDL Methods, Sept. 4-7, 
Marseille, France. Cosponsors: ACM et al. 
Contact Petra Michel, Siemens, A.G. Dept. 
ZFEISEA1, Otto Hahn Ring 6, Munich 83, 
West Germany. 


® ASAP 90, Int’l Conf. on Application- 
Specific Array Processors, Sept. 5-7, 
Princeton, N.J. Sponsor: Princeton Univ. Con¬ 
tact S.Y. Kung, Electrical Engineering Dept., 
Princeton Univ., Princeton, NJ 08544, phone 
(609) 258-3780. 


Fifth IEEE Int’l Symp. on Intelligent Con¬ 
trol, Sept. 5-7, Philadelphia. Sponsor: IEEE 
Control Systems Society. Contact Alex Ney- 
stel, Jayantha Herath, or Steve Gray, Drexel 
Univ., ECE Dept., Philadelphia, PA 19104, 
phone (215) 895-2220, 6758, or 6762. 


13th Int’l ACM/SIGIR Conf. on Research 
and Development in Information Retrieval, 
Sept. 5-7, Brussels. Contact Jean-Luc Vidick, 
Univ. Libre de Bruxelles, Avenue F.D. Roose¬ 
velt, Infodoc, C.P. 142, 1050 Brussels, Bel¬ 
gium. 


Int’l Workshop on VLSI for Artificial Intel¬ 
ligence and Neural Networks, Sept. 5-7, Ox¬ 
ford, England. Contact Jose G. Delgado-Frias, 
Electrical Engineering Dept., SUNY, Bing¬ 
hamton, NY 13901, phone (607) 777-4806, e- 
mail delgado@bingvaxu.cc.binghamton.edu. 

1990 Int’l Electronics Packaging Conf., 
Sept. 9-13, Marlborough, Mass. Sponsor: Int’l 
Electronics Packaging Society. Contact IEPS, 
114 N. Hale St., Wheaton, IL 60187, phone 
(708) 260-1044. 


Workshop on Computers in Systematic Biology, 
Sept. 9-14, Davis, Calif. Sponsor: Nat’l 
Science Foundation. Contact Renaud Fortuner, 
California Dept. of Food and Agriculture, 
Analysis and Identification, Rm. 340, PO 
Box 942871, Sacramento, CA 94271-0001, 
phone (916) 445-4521. 

® ITC 90, Int’l Test Conf., Sept. 10-14, 
Washington, DC. Cosponsor: IEEE 
Philadelphia Section. Contact Donald Denburg, 
AT&T Bell Labs, 1247 S. Cedar Crest 
Blvd., Allentown, PA 18103; or ITC, 1201 
Sussex Turnpike, Suite 101, PO Box 264, Mt. 
Freedom, NJ 07970, phone (201) 895-5260, 
fax (201) 895-7265. 



12, Washington, DC. Sponsor: IEEE Com¬ 
puter Society Technical Committee on Expert 
Systems. Contact Jay Liebowitz, Management 
Sciences Dept., George Washington Univ., 
Washington, DC, phone (202) 994-6969. 

Second Int’l Workshop on Advances in Ro¬ 
bot Kinematics, Sept. 10-12, Linz, Austria. 
Sponsors: Research Inst. for Symbolic Computation 
et al. Contact Sabine Stifler, RISC, 
Johannes Kepler Univ., A-4040 Linz, Austria, 
phone 43 (7236) 3231-50; or Jadran Lenarcic, 
Josef Stefan Inst., Univ. of Edvard Kardelj, 
Jamova 39, 61111 Ljubljana, Yugoslavia, 
phone 38 (61) 214-399. 


Symp. on Object-Oriented Programming 
Emphasizing Practical Applications, Sept. 
14-15, Poughkeepsie, N.Y. Sponsor: Marist 
College. Contact James TenEyck, Marist Col¬ 
lege, Poughkeepsie, NY 12601-1387, phone 
(914) 471-3240, e-mail jzbvtffimaristb.bitnet. 


® ICCD 90, IEEE Int’l Conf. on Computer 
Design: VLSI in Computers and 
Processors, Sept. 16-19, Cambridge, Mass. 
Contact ICCD 90, IEEE Computer Society, 
1730 Massachusetts Ave. NW, Washington, 
DC 20036-1903, phone (202) 371-1013. 


Fourth Digital Signal Processing Work¬ 
shop, Sept. 16-19, New Paltz, N.Y. Sponsor: 
IEEE Signal Processing Society. Contact K.S. 
Arun, Coordinated Science Lab, Univ. of Illinois 
at Urbana-Champaign, 1101 W. Springfield 
Ave., Urbana, IL 61801, phone (217) 333-7678, 
fax (217) 244-1764. 


Internal Audit Advanced Technology Forum, 
Sept. 17-19, Orlando, Fla. Sponsor: Inst. 
of Internal Auditors. Contact Stephen M. Paroby, 
Ernst and Young, 787 Seventh Ave., New 
York, NY 10019, phone (212) 830-6000. 


® ASIC 90, Third IEEE ASIC Seminar 
and Exhibit, Sept. 17-21, Rochester, 
N.Y. Cosponsors: IEEE Rochester Section, 
ACM. Contact Kenneth Hsu, Rochester Inst. of 
Technology, Computer Engineering Dept., 
Rochester, NY 14623, phone (716) 475-2655; 
or Lynne Engelbrecht, 170 Mt. Read Blvd., 
Rochester, NY 14611, phone (716) 328-2310, 
fax (716) 436-9370. 

® EP 90, Electronic Publishing 90, Sept. 
18-20, Gaithersburg, Md. Sponsor: Nat’l 
Inst. of Standards and Technology. Contact 
Peter R. King, Computer Science Dept., Univ. 
of Manitoba, Winnipeg, Man., Canada R3T 
2N2, phone (204) 474-9935. 

ICARCV 90, Int’l Conf. on Automation, 
Robotics, and Computer Vision, Sept. 18-21, 
Singapore. Cosponsors: IEEE Singapore 
Chapter et al. Contact Dinesh P. Mital, 
ICARCV 90, School of Electrical and Elec¬ 
tronic Engineering, Nanyang Technological 
Inst., Nanyang Ave., Singapore 2263, Repub¬ 
lic of Singapore, phone (65) 660-5399. 

Conf. on Multiuser Interfaces and Applica¬ 
tions, Sept. 24-26, Heraklion, Crete, Greece. 
Cosponsors: IFIP et al. Contact Rena Kalaitzaki, 
Computer Science Dept., Univ. of Crete, 
GR 714-09 Heraklion, Crete, Greece, phone 30 
(81) 210-057. 

Int’l Workshop on Expert Systems in Engi¬ 
neering, Sept. 24-26, Vienna, Austria. Spon¬ 
sor: Christian Doppler Expert Systems Lab, 
Univ. of Vienna. Contact Wolfgang Nejdl, 
Technical Univ. of Vienna, Applied Computer 
Science Dept., CD Lab for Expert Systems, 
Paniglgasse 16, 1040 Vienna, Austria, fax 43 
(222) 505-5304, e-mail nejdl@vexpert.at. 

Tencon 90, IEEE Region 10 Conf. on Com¬ 
puter and Communication Systems, Sept. 
24-27, Hong Kong. Cosponsor: IEEE Hong 
Kong Section. Contact Y.S. Cheung, Electrical 
and Electronic Engineering Dept., Univ. of 
Hong Kong, Pokfulam, Hong Kong. 

SIGComm 90, Sept. 24-27, Philadelphia. 
Sponsor: ACM SIGComm. Contact David Far- 
ber, Univ. of Pennsylvania, 200 S. 33rd St., 
Philadelphia, PA 19104-6389, phone (215) 
898-9508, fax (215) 898-0587, e-mail 
farber@cis.upenn.edu; or Phil Karn, Bell 
Communications Research, MS 2P-357, 445 
South St., PO Box 1910, Morristown, NJ 
07962-1910, phone (201) 829-4299. 

Fifth Knowledge-Based Software Assistant 
Conf., Sept. 24-28, Syracuse, N.Y. Sponsor: 
Rome Air Development Center. Contact Bar¬ 
bara Radzisz, Data and Analysis Center for 
Software, PO Box 120, Utica, NY 13503, 
phone (315) 336-0937. 


AIRIES 90, AI Research in the Environmental 
Sciences Workshop, Sept. 25-27, 
Montreal, Que., Canada. Cosponsors: Univ. of 
Quebec at Montreal, Centre Researche Infor- 
matique de Montreal. Contact Rosemary M. 
Dyer, GL/LYP, AIRIES 90, Air Force Geo¬ 
physics Lab, Hanscom Air Force Base, MA 
01731, fax (617) 377-4498. 

Fourth Conf. on Putting Methods and Tools 
into Practice as Aids to Design Information 
Systems, Sept. 25-27, Nantes, France. Spon¬ 
sor: Univ. de Nantes, Inst. Univ. de Technolo- 
gie. Lab. d’Informatique, Liana. Contact H. 
Habrias, 3 Rue du Marechal Joffre, 44041 Nan¬ 
tes Cedex 01, France, phone (33) 4030-6090, 
fax (33) 4030-6001. 


® CI 90, 1990 Int’l Symp. on Computational 
Intelligence, Sept. 27-29, Milano, 
Italy. Sponsors: ACM, F.I.S. Cassa di 
Rosp. o. PC. Contact Giorgio Valle, Universita 
Milano, Dip. Scienze Della Informazione, Via 
Moretto 20133, Milano, Italy, phone 39 (2) 
757-5228, fax 39 (2) 761-10556, e-mail 
valle@imiucca.bitnet. 


Future Trends 90, Workshop on Fu- 
ture Trends of Distributed Computing 
Systems, Sept. 30-Oct. 2, Cairo. Contact Ste¬ 
phen S. Yau, Univ. of Florida, CIS Dept., Rm. 
301, Gainesville, FL 32611, phone (904) 392- 
3261. 


October 1990 


® 15th Conf. on Local Computer 
Networking, Oct. 1-3, Minneapolis, 
Minn. Contact Marc Cohn, Advanced Devel¬ 
opment Div., Raychem Corp., 300 Constitu¬ 
tion Dr., Menlo Park, CA 94025-1164, phone 
(415) 361-3902, fax (415) 361-6099. 


Second Int’l Conf. on Algebraic and Logic 
Programming, Oct. 1-3, Nancy, France. Con¬ 
tact Wolfgang Wechler, TU Braunschweig, 
Theoretische Informatik, Postfach 3329, D- 
3300 Braunschweig, West Germany, e-mail 
wechler@infbs.uucp; or Helene Kirchner, 
CRIN, BP239, 54506 Vandoeuvre-les-Nancy 
Cedex, France. 


InfoJapan 90, Int’l Conf. on Information 
Technology, Oct. 1-5, Tokyo. 
Sponsor: Information Processing Society of 
Japan. Contact InfoJapan 90 Secretariat, c/o 
Simul Int’l, Kowa Bldg. No. 9, 1-8-10 Aka- 
saka, Minato-ku, Tokyo 107, Japan, phone 81 
(3) 586-8691, fax 81 (3) 583-8336. 

® Sixth Int’l Conf. on the Application of 
Standards for Open Systems Interconnection, 
Oct. 2-4, Gaithersburg, Md. Cosponsor: 
Nat’l Inst. of Standards and Technology. 
Contact Brenda Gray, NIST/OSI, Rm. B217, 
Bldg. 225, Gaithersburg, MD 20899, phone 
(301) 975-3664. 


28th Allerton Conf. on Communication, 
Control, and Computing, Oct. 3-5, Mon- 
ticello, Ill. Contact Allerton Conf., c/o Donna 
J. Brown, Univ. of Illinois at Urbana-Champaign, 
Coordinated Science Lab, 1101 W. 
Springfield Ave., Urbana, IL 61801, phone 
(217) 244-0581, e-mail djb@uicsl.csl.uiuc. 


® 1990 IEEE Workshop on Visual Languages, 
Oct. 4-6, Skokie, Ill. Sponsors: 
Univ. of Pittsburgh et al. Contact S.K. Chang, 
Computer Science Dept., Univ. of Pittsburgh, 
Pittsburgh, PA 15260. 

® Frontiers 90, Third Symp. on Frontiers 
of Massively Parallel Computation, 
Oct. 8-10, College Park, Md. Cosponsors: 
Nat’l IEEE Capital Area Chapter, NASA 
Goddard Space Flight Center. Contact Johan¬ 
na Weinstein, Frontiers 90, UMIACS, Univ. of 
Maryland, A.V. Williams Bldg., College Park, 
MD 20742, phone (301) 454-1808. 


Third UNB Artificial Intelligence Workshop, 
Oct. 9, Fredericton, N.B., Canada. Sponsor: 
Univ. of New Brunswick. Contact B.G. 
Nickerson, School of Computer Science, Univ. 
of New Brunswick, PO Box 4400, Fredericton, 
N.B., Canada E3B 5A3, phone (506) 453-4566, 
fax (506) 453-3566, e-mail bgn@unb. 


® Ninth Symp. on Reliable Distributed 
Systems, Oct. 9-12, Huntsville, Ala. 
Contact Raif M. Yanney, TRW, MS DH2/2328, 
1 Space Park, Redondo Beach, CA 
90278, phone (213) 764-6033. 

Northcon 90, Oct. 9-11, Seattle. Cosponsors: 
IEEE et al. Contact Northcon 90 Professional 
Program Committee, c/o Ramona Baker, 8110 
Airport Blvd., Los Angeles, CA 90045-3194, 
phone (213) 215-3796, ext. 222. 

PDCS 90, ISMM Int’l Conf. on Parallel and 
Distributed Computing and Systems, Oct. 
10-12, New York City. Sponsor: Int’l Society 
for Mini and Microcomputers. Contact R. Am- 
mar, U155, Computer Science and Engineer¬ 
ing Dept., Univ. of Connecticut, Storrs, CT 
06268, fax (203) 486-0318. 

EuroForum 90, Oct. 11-12, Daresbury, 
Cheshire, UK. Contact Kate Faulkner, EuroForum 
90, ICL, Manchester M12 5DR, UK, 
phone 44 (61) 223-1301, fax 44 (61) 223-1207. 

Second Int’l Conf. on Microelectronics, Oct. 
13-15, Damascus, Syria. Sponsor: Arab 
School of Science and Technology. Contact 
M.I. Elmasry, VLSI Research Group, Univ. of 
Waterloo, Waterloo, Ont., Canada N2L 3G1, 
phone (519) 885-1211, ext. 3753. 

1990 Fall VHDL Users’ Group Meeting, 
Oct. 14-17, Oakland, Calif. Contact Rachel 
Rusting, Intermetrics, 733 Concord Ave., 
Cambridge, MA 02138, phone (617) 661- 
1840. 

AIPR 19, Workshop on Applied Imagery 
Pattern Recognition, Oct. 17-19, McLean, 
Va. Sponsors: Society of Photooptical Instru¬ 
mentation Engineers, Rome Air Development 
Center. Contact Brian Mitchell, ERIM, PO 
Box 8618, Ann Arbor, MI 48106, phone (313) 
994-1200, ext. 2713. 

12th Saudi Nat’l Computer Conf. on Planning 
for the Informatics Society, Oct. 21-24, 
Riyadh, Saudi Arabia. Cosponsors: King Saud 
Univ., Saudi Computer Society. Contact M.M. 
Mandurah, College of Computer and Informa¬ 
tion Sciences, PO Box 51178, Riyadh, 11543, 
Saudi Arabia, phone 966 (1) 467-6993. 

OOPSLA 90, Fifth Conf. on Object-Ori¬ 
ented Programming Systems, Languages, 
and Applications, Oct. 21-25, Ottawa, Canada. 
Sponsor: ACM. Contact Assoc. for Computing 
Machinery, 11 W. 42nd St., New York, 
NY 10036, phone (212) 869-7440. 

FOCS, 31st Foundations of Computer 
Science, Oct. 22-24, St. Louis, Mo. Con¬ 
tact Christos Papadimitriou, Computer Sci¬ 
ence Dept., Univ. of California at San Diego, 
La Jolla, CA 92093, phone (619) 534-2086. 

Int’l Conf. on Computer Applications in Developing 
Countries, Oct. 22-24, Benin City, 
Nigeria. Sponsor: Large Scale Systems Research 
Group, Univ. of Benin. Contact E.A. 
Onibere, Mathematics and Computer Science 
Dept., Univ. of Benin, P.M.B. 1154, Benin 
City, Nigeria. 

Ninth National Conf. on EDP System and 
Software Quality Assurance, Oct. 22-24, 
Washington, DC. Sponsor: Data Processing 
Management Assoc. Contact US Professional 
Development Inst., EDP System and Software 
Quality Assurance, 1734 Elton Rd., Suite 221, 
Silver Spring, MD 20903-1733, phone (301) 
445-4400, fax (301) 445-5722. 

JCIT 5, Fifth Jerusalem Conf. on Information 
Technology, Oct. 22-25, 
Jerusalem, Israel. Sponsor: Information Processing 
Assoc. of Israel. Contact Abraham 
Peled, IBM T.J. Watson Research Center, PO 
Box 704, Yorktown Heights, NY 10598. 

CC 90, Third Int’l Workshop on Compiler 
Compilers, Oct. 22-26, Schwerin, East Germany. 
Sponsors: German Democratic Republic 
Academy of Sciences Inst. of Informatics 
and Computing Technique et al. Contact Michael 
Albinus, CC 90 Organizing Committee, 
Akademie der Wissenschaften der DDR, Inst. 
fur Informatik und Rechentechnik, Rudower 
Chaussee 5, Berlin, GDR — 1199. 

Third Int’l Symp. on Artificial Intelligence, 
Oct. 22-26, Monterrey, N.L. Mexico. Spon¬ 
sors: ITESM (Inst. Tecnologico y de Estudios 
Superiores de Monterrey) et al. Contact Hugo 
Terashima, Centro de Inteligencia Artificial, 
ITESM, Suc. de Correos “J”, C.P. 64849 Monterrey, 
N.L. Mexico, phone 52 (83) 58-2000, 
fax 52 (83) 58-0771, e-mail isai@tecmtyvm. 

Visualization 90, Oct. 23-26, San Fran- 
cisco. Contact Bruce Brown, Oracle 
Corp., 20 Davis Dr., Belmont, CA 94002, 
phone (415) 598-3628. 

ESORICS 90, European Symp. on Research 
in Computer Security, Oct. 24-26, Toulouse, 
France. Sponsor: AFCET. Contact Martin 
Gilles, 16 Parc de Diane, 78350 Jouy en Josas, 
Toulouse Cedex, France. 

First Japanese Knowledge Acquisition for 
Knowledge-Based Systems Workshop, Oct. 
25-26, Kyoto, Japan, and Oct. 29-31, Tokyo. 
Cosponsors: Kansai Inst. of Information Systems 
et al. Contact John H. Boose, Advanced 
Technology Center, Boeing Computer Ser¬ 
vices 7L-64, PO Box 24346, Seattle, WA 
98124, phone (206) 865-3253. 

® NACLP 90, 1990 North American 
Conf. on Logic Programming, Oct. 28- 
Nov. 1, Austin, Texas. Cosponsor: ACM. Con¬ 
tact Carlo Zaniolo, MCC, 3500 W. Balcones 
Center Dr., Austin, TX 78759, phone (512) 
338-3442. 

Int’l Conf. on Information Technology, Oct. 
29-31, Bournemouth, UK. Sponsor: Institu¬ 
tion of Electrical Engineers. Contact Conf. 
Services, IEE, Savoy Place, London WC2R 
0BL, UK, phone 44 (71) 240-1871, fax 44 (71) 
240-7735. 


PNSQC Committee. Contact Terri Moore, Pa¬ 
cific Agenda, PO Box 10142, Portland, OR 
97210, phone (503) 223-8633. 

ISCIA 5, Fifth Int’l Symp. on Computer and 
Information Sciences, Oct. 30-Nov. 2, Cap¬ 
padocia, Nevsehir, Turkey. Sponsors: Istanbul 
Technical Univ. et al. Contact A. Emre Har- 
manci, Istanbul Technical Univ., Bilgi Islem 
Merkezi, Ayazaga, 80626 Istanbul, Turkey, 
phone 090 (1) 176-3254, fax 090 (1) 176-1734, 
e-mail harmanci@tritu.bitnet. 

Compsac 90, 14th Int’l Computer 
Software and Applications Conf., Oct. 
31-Nov. 2, Chicago. Contact Ifay F. Chang, 
Rm. 1B28, IBM T.J. Watson Research Center, 
PO Box 714, Yorktown Heights, NY 10598, 
phone (914) 789-7825, fax (914) 784-6211. 


November 1990 


14th SCAMC, 1990 Symp. on Computer Ap¬ 
plications in Medical Care, Nov. 4-7, Wash¬ 
ington, DC. Cosponsors: George Washington 
Univ. Medical Center et al. Contact SCAMC — 
Office of CEM, George Washington Univ. 
Medical Center, 2300 K St. NW, Washington, 
DC 20037, phone (202) 994-8928. 

® 1990 IFIP-IEEE Int’l Workshop on 
Defect and Fault Tolerance in VLSI 
Systems, Nov. 5-7, Grenoble, France. Contact 
Gabriel Saucier, Inst. National Polytechnique 
de Grenoble/CSI, 46 avenue Felix-Viallet, 
38031 Grenoble Cedex, France, phone (33) 76- 
57-46-87, fax (33) 76-50-23-21; or Tulin E. 
Mangir, TRW, 1 Space Park, R2/2036, Re¬ 
dondo Beach, CA 90278, phone (213) 813- 
3894, fax (213) 813-3709. 

24th Asilomar Conf. on Signals, Systems, 
and Computers, Nov. 5-7, Pacific Grove, 
Calif. Sponsors: Naval Postgraduate School et 
al. Contact George M. Dillard, Naval Ocean 
Systems Center, San Diego, CA 92152-5000, 
phone (619) 553-2478. 

ICCS 90, Int’l Conf. on Communication Sys¬ 
tems, Nov. 5-9, Singapore. Cosponsors: IEEE 
Singapore Section et al. Contact ICCS 90, c/o 
Meeting Planners Pte. Ltd., 100 Beach Rd. 
#33-01, Shaw Towers, Singapore 0718, 

Second SIAM Conf. on Linear Algebra in 
Signals, Systems, and Control, Nov. 5-9, San 
Francisco, Calif. Contact Society for Indus¬ 
trial and Applied Mathematics, 3600 Univer¬ 
sity City Science Center, Philadelphia, PA 
19104-2688, phone (215) 382-9800, fax (215) 
386-7999, e-mail siamconfs@wharton.upenn. 

ICCC 90, 10th Int’l Conf. on Computer 
Communication, Nov. 5-9, New Delhi, India. 
Sponsor: Int’l Council on Computer Commu¬ 
nication. Contact Saroj Chowla or P.P. Gupta, 
ICCC 90, CMC Ltd., A-5 Ring Rd., South Ex¬ 
tension Part I, New Delhi 110 049, India, phone 
91 (11) 626-807, fax 91 (11) 684-4652. 

Intelligent Robotic Systems: Design and 
Applications, Nov. 6-7, Philadelphia. Sponsor: 
SPDB. Contact Mohan M. Trivedi, Univ. of 
Tennessee, E&CE, Ferris Hall, Knoxville, TN 
37996-2100, phone (615) 974-5450. 


® TAI 90, Second Computer Society 
Int’l Conf. on Tools for Artificial Intelligence, 
Nov. 6-9, Washington, DC. Cosponsors: 
Rutgers Univ. et al. Contact Nikolas G. 
Bourbakis, IBM, 5600 Cottle Rd., San Jose, 
CA 95193, phone (408) 270-3455. 


® IEEE Workshop on the Management 
of Replicated Data, Nov. 7-9, Houston. 
Sponsor: IEEE Computer Society Technical 
Committee on Operating Systems. Contact 
Luis-Felipe Cabrera, IBM Almaden Research 
Center, 650 Harry Rd., MC K55/803, San Jose, 
CA 95120-6099, phone (408) 927-1838. 


1990 IEEE Workshop on VLSI Signal Pro¬ 
cessing, Nov. 7-9, San Diego, Calif. Contact 
Patti Fenstermacher, AT&T Bell Labs, 1243 S. 
Cedar Crest Blvd., Allentown, PA 18103, 
e-mail psf@aloft.att.com; or Howard S. 
Moscovitz, AT&T Bell Labs, 1243 S. Cedar 
Crest Blvd., Allentown, PA 18103, e-mail 
mosc@aloft.att.com. 


Int’l Workshop on Network and Operating 
System Support for Digital Audio and Video, 
Nov. 8-9, Berkeley, Calif. Sponsor: Int’l 
Computer Science Inst. Contact Ramesh Govindan, 
ICSI, 1947 Center St., Suite 600, Berkeley, 
CA 94704-1105, phone (415) 642-4274, 
ext. 136, e-mail av-workshop@berkeley.edu. 

Computational Science in Industry and the 
Comprehensive Univ., Nov. 8-10, Pomona, 
Calif. Sponsor: Calif. State Polytechnic Univ. 
at Pomona. Contact Bruce P. Hillam, Com¬ 
puter Science Dept., Calif. State Polytechnic 
Univ., 3801 W. Temple Ave., Pomona, CA 
91768, phone (714) 869-3440. 


Fourth Southeastern Small-College Com¬ 
puting Conf., Nov. 9-10, Hickory, N.C. Spon¬ 
sor: Consortium for Computing in Small Col¬ 
leges. Contact Susan Dean, Samford Univ., 
800 Lakeshore Dr., Birmingham, AL 35229, 
Bitnet stdean@samford.bitnet. 


® ICCAD 90, IEEE Int’l Conf. on Computer-Aided 
Design, Nov. 11-15, Santa 
Clara, Calif. Cosponsor: IEEE Circuits and 
Systems Society. Contact Pat Pistilli, MP Associates, 
7490 Clubhouse Rd., Suite 102, 
Boulder, CO 80301, phone (303) 530-4562. 


Vision 90, Nov. 12-15, Detroit. Cosponsors: 
Society of Manufacturing Engineers and SME 
Machine Vision Assoc. Contact Lisa Macha- 
cki, Vision 90, SME Conf. Dept., PO Box 930, 
Dearborn, MI, phone (313) 271-1500, ext. 369. 

® Supercomputing 90, Nov. 12-16, New 
York City. Cosponsor: ACM. Contact 
Joanne L. Martin, IBM T.J. Watson Research 
Center, PO Box 218, Route 134, Yorktown 
Heights, NY 10598, phone (914) 945-3285, 
e-mail jlmart@ibm.com; or Supercomputing 
90, IEEE Computer Society, 1730 Massachu¬ 
setts Ave. NW, Washington, DC 20036-1903, 
phone (202) 371-1013. 


waii. Sponsor: State of Hawaii. Contact Wil¬ 
liam M. Ball, State of Hawaii, 300 Kahelu St., 
Suite 35, Mililani, HI 96789, phone (808) 625- 
5293. 


Fall Comdex, Nov. 13-17, Las Vegas. Contact 
Interface Group, 300 First Ave., Needham, 
MA 02194, phone (617) 449-6600. 


PRICAI 90, Pacific Rim Int’l Conf. on 
Artificial Intelligence 90, Nov. 14-16, 
Nagoya-shi, Aichi, Japan. Sponsor: Japanese 
Society for Artificial Intelligence et al. Con¬ 
tact Teruo Fukumura, Inter Group Corp., Aka- 
saka Yamakatsu Bldg., 8-5-32 Akasaka, Mi- 
nato-ku, Tokyo 107, Japan, phone (03) 479- 
5535. 


14th Western Educational Computing 
Conf., Nov. 15-16, Irvine, Calif. Sponsor: 
California Educational Computing Consor¬ 
tium. Contact Oliver Seely, Jr., California 
State Univ. at Dominguez Hills, Chemistry, 
1000 E. Victoria St., Carson, CA 90747. 

AIDA 90, Sixth Conf. on Artificial Intelli¬ 
gence and Ada, Nov. 15-16, Reston, Va. Spon¬ 
sors: George Mason Univ. et al. Contact AIDA 
90, Computer Science Dept., George Mason 
Univ., 4400 University Dr., Fairfax, VA 
22030, phone (703) 323-2713, fax (703) 323- 
2630, e-mail aida@gmuvax.gmu.edu. 


Cognitiva 90, Nov. 20-23, Madrid. 
Sponsor: AFCET. Contact Cognitiva 90, 
c/o Assoc. Francaise pour la Cybernetique 
Economique et Technique, 156 Bd. Pereire, 
75017 Paris, France, phone 33 (1) 4766-2419, 
fax 33 (1) 4267-9312. 


AI 90, Australian Joint Artificial Intelligence 
Conf., Nov. 21-23, Perth, Western Australia. 
Sponsor: Australian Computer Society. 
Contact Les Kitchen, Univ. of Western Austra¬ 
lia, Computer Science Dept., Nedlands, West¬ 
ern Australia, 6009, phone 61 (9) 380-2281, 
e-mail ai90@wacsvax.oz.au. 

® IFIP Workshop on Electronic Design 
Automation Frameworks, Nov. 26-28, 
Charlottesville, Va. Sponsor: Int’l Federation 
for Information Processing. Contact Ron Wax- 
man, Univ. of Virginia, Thornton Hall, Char¬ 
lottesville, VA 22903, phone (804) 924-6086. 


® IEEE 1990 Conf. on Software Maintenance, 
Nov. 26-29, San Diego, Calif. 
Contact Thomas M. Pigoski, USN, NSGD 
Pensacola, Corry Station, Pensacola, FL 
32511, phone (904) 452-6399. 


NIPS 90, IEEE Conf. on Neural Information 
Processing Systems, Nov. 26-29, Denver, 
Colo. Contact Kathie Hibbard, Engineering 
Center, Univ. of Colorado, Campus Box 425, 
Boulder, CO 80309-0425. 


44106, phone (216) 368-5277, e-mail 
cap@alpha.ces.cwru.edu. 

Iecon 90, 16th Conf. of the IEEE Industrial 
Electronics Society, Nov. 27-30, Pacific 
Grove, Calif. Contact Robert Begun, 23609 
Skyview Terr., Los Gatos, CA 95030, phone 
(408) 353-1560. 

IAPR Workshop on Machine Vision Appli¬ 
cations, Nov. 28-30, Tokyo. Sponsor: Int’l As¬ 
soc. for Pattern Recognition. Contact Mikio 
Takagi, Inst. of Industrial Science, Univ. of 
Tokyo, 7-22-1 Roppongi, Minatoku, Tokyo 
106, Japan, phone 81 (3) 479-0289, fax 81 (3) 
423-2834, e-mail takagi@tkl.iis.u-tokyo.ac.jp. 


December 1990 

® First Int’l Symp. on Uncertainty and 
Analysis: Fuzzy Reasoning, Probabilistic 
Methods, and Risk Management, Dec. 
3-5, College Park, Md. Sponsors: Univ. of 
Maryland et al. Contact Bilal M. Ayyub, Civil 
Engineering Dept., Univ. of Maryland, Col¬ 
lege Park, MD 20742. 

ACM SIGSoft 90, Fourth Symp. on Software 
Development Environments, Dec. 3-5, Ir¬ 
vine, Calif. Sponsor: ACM. Contact Dewayne 
E. Perry, AT&T Bell Labs, 600 Mountain Ave., 
Murray Hill, NJ 07974, phone (201) 582-2529. 


Sixth Computer Security Applications 
Conf., Dec. 3-7, Tucson, Ariz. Sponsors: 
American Society for Industrial Security et al. 
Contact Marshall D. Abrams, Mitre Corp., 
7525 Colshire Dr., M/S Z269, McLean, VA 
22101, phone (703) 883-6938, e-mail 
abrams@mitre.org. 


Tri-Ada 90, Dec. 3-7, Baltimore, Md. Spon¬ 
sor: ACM. Contact Erhard Ploedereder, Tartan 
Labs, 300 Oxford Dr., Monroeville, PA 15146, 
phone (412) 856-3600, fax (412) 856-3636, 
e-mail ploedere@tartan.com or 
ploedere@ajpo.sei.cmu.edu. 


® ICCV 90, Third Int’l Conf. on Computer 
Vision, Dec. 4-7, Osaka, Japan. 
Contact ICCV 90, IEEE Computer Society, 
1730 Massachusetts Ave. NW, Washington, 
DC 20036-1903, phone (202) 371-1013. 


SEARCC 90, South East Asia Regional 
Computer Confederation Conf., Dec. 4-8, 
Manila. Sponsor: Philippine Computer Society. 
Contact Victor B. Gruet, Computer Information 
Systems, CIS Bldg., Meralco Compound, 
Ortigas Ave., 1602 Pasig, Metro Manila, 
Philippines, phone 63 (2) 722-1251, fax 
63 (2) 722-0141. 


® Micro 23, 23rd Symp. and Workshop 
on Microprogramming and Microarchitecture, 
Nov. 27-29, Orlando, Fla. Cosponsor: 
ACM. Contact Chris Papachristou, 
Case Western Reserve Univ., Computer Engi¬ 
neering and Science Dept., Cleveland, OH 


11th Real-Time Systems Symp., Dec. 
5-7, Orlando, Fla. Sponsor: IEEE Com¬ 
puter Society Technical Committee on Real- 
Time Computing. Contact Doug Locke, IBM 
— MS 409, Systems Integration Div., 6600 








CASE 90, Fourth Int’l Workshop on 
Computer-Aided Software Engineer¬ 
ing, Dec. 5-8, Irvine, Calif. Contact Elliott J. 
Chikofsky, Radius Systems, 75 Lexington St., 
Burlington, MA 01803, phone (617) 494- 
8200. 

® WSC 90, 1990 Winter Simulation 
Conf., Dec. 9-12, New Orleans. Contact 
Randall P. Sadowski, Systems Modeling 
Corp., 504 Beaver St., Sewickley, PA 15143, 
phone (412) 741-3727. 

® Second IEEE Symp. on Parallel and 
Distributed Processing, Dec. 9-12, 
Dallas. Cosponsor: Dallas Chapter of the IEEE 
Computer Society. Contact Behrooz Shirazi, 
Computer Science Dept., Southern Methodist 
Univ., 6425 Airline Rd., Dallas, TX 75275- 
0122, phone (214) 692-2874, e-mail 
shirazi%smu.uucp@uunet.uu.net. 

San Diego Workshop on Volume Visu- 
alization, Dec. 10-12, La Jolla, Calif. 
Cosponsor: ACM. Contact T. Todd Elvins, 
SDSC, Box 85608, San Diego, CA 92038, 
phone (619) 534-5128. 

ICDT 90, Third Int’l Conf. on Database The¬ 
ory, Dec. 11-15, Paris. Sponsor: INRIA. Con¬ 
tact INRIA, Domaine de Voluceau — Roc- 
quencourt, BP 105, 78153 Le Chesnay Cedex, 
France, phone 33 (1) 3963-5500, fax 33 (1) 
3963-5638. 

10th Conf. on Foundations of Software 
Technology and Theoretical Computer Sci¬ 
ence, Dec. 17-19, Bangalore, India. Contact 
Y.N. Srikant, Indian Inst. of Science, Bangalore 
560 012, India, phone (812) 334-411. 

1990 IEEE Workshop on Languages 
and Architectures for Automation, 
Dec. 19-21, Honolulu, Hawaii. Sponsors: Pacific 
Int’l Center for High Technology Research 
et al. Contact D.Y.Y. Yun, Univ. of Hawaii, 
711 Kapiolani Blvd., Suite 200, Honolulu, 
HI 96813-5249, phone (808) 539-1532, 
fax (808) 941-1399; or Shi-Kuo Chang, 322 
Alumni Hall, Univ. of Pittsburgh, Pittsburgh, 
PA 15260, phone (412) 624-8493, fax (412) 
624-8465, e-mail chang@vax.cs.pitt.edu. 


phone (303) 491-7031, fax (303) 491-2293, 
e-mail malaiya@ravi.cs.colostate.edu; or D. 
Roy Chowdhury, Gateway Design Automa¬ 
tion, SDF#A-1, Noida Export Processing 
Zone, PO NEPZ, Noida 201305, India, phone 
91 (05736) 62342, fax 91 (05736) 62343. 

SIAM Workshop on Automatic Differentia¬ 
tion of Algorithms, Jan. 7-9, Breckenridge, 
Colo. Contact Society for Industrial and Ap¬ 
plied Mathematics, Conf. Coordinator, Dept. 
CC0590, 3600 University City Science Center, 
Philadelphia, PA 19104-2688, phone (215) 
382-9800, fax (215) 386-7999, e-mail 
siamconfs@wharton.upenn.edu. 

® Int’l Conf. on Multimedia Information 
Systems, Jan. 16-18, Singapore. 
Contact Juzar Motiwalla, Inst. of Systems Science, 
Nat’l Univ. of Singapore, Heng Mui 
Keng Terr., Kent Ridge, Singapore 0511, 
phone (65) 772-2075. 

Int’l Workshop on Unix-Based Software 
Development Environments, Jan. 16-18, 
Dallas, Texas. Sponsor: Usenix Assoc. Contact 
Usenix Conf. Office, 22672 Lambert St., 
Suite 613, El Toro, CA 92630, phone (714) 
588-8649. 

PADS, Workshop on Parallel and Dis- 
tributed Simulation, Jan. 21-23, Ana¬ 
heim, Calif. Cosponsors: ACM, SCS. Contact 
David M. Nicol, Computer Science Dept., Col¬ 
lege of William and Mary, Williamsburg, VA 
23185, phone (804) 221-3458, e-mail nicol@ 


Second ACM-SIAM Symp. on Discrete Al¬ 
gorithms, Jan. 28-30, San Francisco, Calif. 
Contact SIAM Conf. Coordinator, Dept. 
CC0590, 3600 University City Science Center, 
Philadelphia, PA 19104-2688, phone (215) 
382-9800, fax (215) 386-7999, e-mail 
siamconfs@wharton.upenn.edu. 

IEEE Int’l Conf. on Wafer Scale Integration, 
Jan. 29-31, San Francisco, 
Calif. Cosponsors: IEEE Components, Hybrids, 
and Manufacturing Technology Society. 
Contact Terry Chappell, 730 Encino Dr., 
Aptos, CA 95003, phone (408) 662-1936; or R. 
Mike Lea, Brunel Univ., Uxbridge UB8 3PH, 
UK, phone (44) 895-74000, ext. 2821, fax (44) 
895-58728, e-mail mike.lea@brunel.ac.uk. 


ISSCC 91, 1991 IEEE Int’l Solid-State Circuits 
Conf., Feb. 13-15, San Francisco, Calif. 
Sponsors: IEEE Solid-State Circuits Council 
et al. Contact Diane Suiters, Courtesy Associ¬ 
ates, 655 15th St. NW, Suite 300, Washington, 
DC 20005, phone (202) 639-4255. 

PCCS 1, First Int’l Workshop on Performa- 
bility Modeling of Computer and Communi¬ 
cation Systems, Feb. 18-19, Enschede, The 
Netherlands. Contact Nico M. van Dijk, Free 
Univ., Faculty of Economics, PO Box 7161, 
1007 MC Amsterdam, The Netherlands, phone 
31 (20) 548-7061, fax 31 (20) 462-645, e-mail 
ectricvu@sara.nl. 

CAIA 91, Seventh IEEE Conf. on Arti- 
ficial Intelligence Applications, Feb. 
24-28, Miami Beach, Fla. Contact IEEE Com¬ 
puter Society, 1730 Massachusetts Ave. NW, 
Washington, DC 20036-1903, phone (202) 
371-1013. 

Fourth Topical Meeting on Robotics and 
Remote Systems for Hazardous Environ¬ 
ments, Feb. 24-28, Albuquerque, N.M. Con¬ 
tact Raymond W. Harrigan, Div. 1414, Sandia 
Nat’l Labs, Albuquerque, NM 87185, phone 
(505) 846-6278, fax (505) 846-7425. 

EDAC 91, European Design Automation 
Conf., Feb. 25-28, Amsterdam. 
Sponsor: Institution of Electrical Engineers. 
Contact Secretariat, EDAC 91, CEP Consult¬ 
ants, 26-28 Albany St., Edinburgh EH1 3QH, 
Scotland, phone 44 (31) 557-2478, fax 44 (31) 
557-5749. 

Compcon Spring 91, Feb. 25-Mar. 1, 
San Francisco. Contact Compcon Spring 
91, IEEE Computer Society, 1730 Massachu¬ 
setts Ave. NW, Washington, DC 20036-1903, 
phone (202) 371-1013. 


March 1991 

Fifth Int’l Workshop on High-Level 
Synthesis, Mar. 3-6, Buhlerhohe, West 
Germany. Cosponsors: IEEE et al. Contact 
Raul Camposano, IBM T.J. Watson Research 
Center, PO Box 218, Yorktown Heights, NY 
10598, phone (914) 945-3871, e-mail raulc@ 


Seventh Israeli Conf. on Artificial Intelligence 
and Computer Vision, Dec. 26-27, Tel 
Aviv. Contact A. Bruckstein, Faculty of Computer 
Science, Technion, 32000 Haifa, Israel, 
e-mail freddy@techsel.bitnet; or Shmuel Peleg, 
David Sarnoff Research Center, CN 5300, 
Princeton, NJ 08543-5300, phone (609) 734- 
2284, e-mail peleg@vision.sarnoff.com. 


January 1991 


Fourth CSI/IEEE Int’l Symp. on VLSI 
Design, Jan. 5-8, New Delhi. Sponsors: 
Computer Society of India et al. Contact Yash- 
want K. Malaiya, Computer Science Dept., 
Colorado State Univ., Fort Collins, CO 80523, 


February 1991 

Systems/USA Technology Conf., Feb. 11-13, 
Anaheim, Calif. Sponsor: American Electronics 
Assoc. Contact AEA, 5201 Great America 
Pkwy., Santa Clara, CA 95054, phone (503) 
359-5873 or (408) 987-4204, fax (503) 357- 
3839 or (408) 970-8565. 

Fifth Int’l Conf. on Modeling Techniques 
and Tools for Computer Performance Evaluation, 
Feb. 13-15, Torino, Italy. Contact Maria 
Carla Calzarossa, Dip. di Informatica e Sistemistica, 
Univ. di Pavia, Via Abbiategrasso, 
209, 27100 Pavia, Italy, phone 39 (382) 391-350, 
fax 39 (382) 422-881, e-mail mcc@ipvpel.infn.it. 


Fourth Computer Virus and Security 
Conf., Mar. 14-15, New York City. 
Sponsor: Data Processing Management Assoc. 
Financial Industries. Contact Judy S. Brand, 
PO Box 6313, FDR Station, New York, NY 
10150, phone (800) 835-2246. 


Third IEE Conf. on Telecommunications, 
Mar. 17-20, Edinburgh, Scotland. Sponsor: 
Inst. of Electrical Engineers. Contact Conf. 
Services, IEE, Savoy Place, London WC2R 
0BL, UK, phone 44 (71) 240-1871, fax 44 (71) 
240-7735. 


Symp. on Experiences with Distrib- 
uted and Multiprocessor Systems, 
Mar. 21-22, Atlanta. Sponsor: Usenix Assoc. 
Contact George Leach, AT&T Paradyne, MS 
LG-129, PO Box 2826, Largo, FL 34649-2826, 


phone (813) 530-2376, e-mail reggie@pdn.paradyne.com. 

Fifth SIAM Conf. on Parallel Processing 
and Scientific Computing, Mar. 25-27, 
Houston. Contact Society for Industrial and 
Applied Mathematics Conf. Coordinator, 
Dept. CC0590, 3600 University City Science 
Center, Philadelphia, PA 19104-2688, phone 
(215) 382-9800, fax (215) 386-7999, e-mail 
siamconfs@wharton.upenn.edu. 


April 1991 


24th Computer Simulation Conf., Apr. 1-5, 
New Orleans. Sponsor: Society for Computer 
Simulation. Contact George W. Zobrist, Computer 
Science Dept., Univ. of Missouri at 
Rolla, Rolla, MO 65401, phone (314) 341-4836, 
e-mail c2816@umrvmb.umr.edu. 

Dasfaa 91, Second Int’l Symp. on Database 
Systems for Advanced Applications, 
Apr. 2-4, Tokyo. Sponsor: Information 
Processing Society of Japan. Contact Yahiko 
Kambayashi, Computer Science Dept., Kyu¬ 
shu Univ., 6-10-1 Hakozaki, Higashi Fukuoka 
812, Japan, phone 81 (92) 641-1101, ext. 5407; 
or Yoshifumi Masunaga, Univ. of Library and 
Information Science, 1-2 Kasuga, Tsukuba, 
Ibaraki 305, Japan, phone 81 (298) 52-0511, 
ext. 340, fax 81 (298) 52-4326, e-mail 
masunaga@ulis.ac.jp. 

IEEE Infocom 91, Conf. on Computer 
Communications, Apr. 7-11, Miami, 
Fla. Cosponsors: IEEE Computer and Communications 
Societies. Contact N. Shacham, 
IEEE Infocom 91, SRI Int’l, 333 Ravenswood 
Ave., Menlo Park, CA 94025, phone (415) 859-5710, 
e-mail shacham@sri.com. 

IMS 91, First IEEE Int’l Workshop on 
Interoperability in Multidatabase 
Systems, Apr. 8-9, Kyoto, Japan. Contact Ahmed 
K. Elmagarmid, Purdue Univ., Computer 
Sciences Dept., West Lafayette, IN 47907, 
phone (317) 494-1998; or Yutaka Matsushita, 
Instrumentation Dept., Keio Univ., Hiyoshi, 
Yokohama, Japan, phone 81 (44) 63-1141, ext. 
3564. 

ASPLOS 4, Fourth Int’l Conf. on 
Architectural Support for Programming 
Languages and Operating Systems, 
Apr. 8-11, Santa Clara, Calif. Sponsor: ACM. 
Contact Bob Rau, Hewlett-Packard Labs, 1501 
Page Mill Rd., Bldg. 3U, Palo Alto, CA 94304, 
fax (415) 857-8558, e-mail rau@hplabs.hp. 


Seventh Int’l Conf. on Data Engineering, 
Apr. 8-12, Kobe, Japan. Contact 
Ming T. (Mike) Liu, Computer and Informa¬ 
tion Science Dept., Ohio State Univ., 2036 Neil 
Ave., Columbus, OH 43210-1277, phone 
(614) 292-1837, e-mail liu@cis.ircc.ohio-state.edu; 
or Data Engineering 91, IEEE Computer 
Society, 1730 Massachusetts Ave. NW, 
Washington, DC 20036-1903, phone (202) 
371-1013, fax (202) 728-0884. 


Ninth IEEE VLSI Test Symp., Apr. 16-18, 
Atlantic City, N.J. Cosponsor: IEEE 
Philadelphia Section. Contact Mukund Modi, 
Naval Air Engineering Center, ATE Software 
Center, Code: 52514, Lakehurst, NJ 08733, 
phone (201) 323-7002, fax (301) 323-7445. 

ETC 91, 1991 European Test Conf., 
Apr. 16-19, Munich, West Germany. 
Sponsor: VDE (Zentralstelle Tagungen und 
Seminare). Contact Peter Stilke, VDE, Strese- 
mannallee 15, D-6000 Frankfurt 70, West Ger¬ 
many, phone (69) 6308-203, fax (69) 6308- 
273. 

CHDL 91, 10th Int’l Symp. on Computer 
Hardware Description Languages 
and their Applications, Apr. 22-24, 
Marseille, France. Cosponsors: International 
Federation for Information Processing et al. 
Contact Dominique Borrione, Imag/Artemis, 
BP 53X, 38041 Grenoble Cedex, France, 
phone (33) 7651-4604, ext. 5240, fax (33) 
7651-9637, e-mail borrione@imag.imag.fr. 

Second Int’l Conf. on Systems Integra- 
tion, Apr. 22-25, Morristown, N.J. Co¬ 
sponsors: New Jersey Inst, of Technology et al. 
Contact Peter A. Ng, Computer and Informa¬ 
tion Science Dept., New Jersey Inst, of Tech¬ 
nology, University Heights, Newark, NJ 
07102, phone (201) 596-3387, e-mail ng_p@ 
vienna.njit.edu. 

NCGA 91, 1991 National Computer Graph¬ 
ics Assoc. Conf., Apr. 22-25, Chicago. Con¬ 
tact NCGA, 2722 Merrilee Dr., Suite 200, 
Fairfax, VA 22031, phone (703) 698-9600. 

CHI 91, 1991 Conf. on Human Factors 
in Computing Systems, Apr. 27-May 2, 
New Orleans. Sponsor: ACM. Contact Keith 
Butler, Boeing, Advanced Technology Center, 
PO Box 24346, M/S 7L-64, Seattle, WA 98124, 
phone (206) 865-3389; or June Davis, 13 An¬ 
napolis St., Annapolis, MD 21401, phone 
(301) 269-6801. 


May 1991 


tems. Contact Luis-Felipe Cabrera, IBM Almaden 
Research Center, MC K55/801, 650 
Harry Rd., San Jose, CA 95120-6099, phone 
(408) 927-1838, e-mail cabrera@ibm.com; or 
Kenneth Kane, Boston Development Center, 
Sun Microsystems, 2 Federal St., Billerica, 
MA 01802, phone (508) 671-0367, e-mail 
kkane@east.sun.com. 

ICDCS 91, 11th Int’l Conf. on Distributed 
Computing Systems, May 20-24, 
Arlington, Texas. Contact Bill D. Carroll, 
Computer Science Dept., Engineering, Univ. 
of Texas at Arlington, Box 19015, Arlington, 
TX 76019-0015, phone (817) 273-3785, 
e-mail carroll@evax.ari.utexas.edu. 

SESAW, Fourth Software Engineer- 
ing Standard Application Workshop, 
May 21-23, San Diego, Calif. Contact Vera V. 
Edelstein, Nynex, 500 Westchester Ave., 
White Plains, NY 10604, phone (914) 683- 
2888. 

ISCA 18, 18th Int’l Symp. on Computer 
Architecture, May 27-30, 
Toronto, Canada. Cosponsor: ACM. Contact 
K.C. Smith, Univ. of Toronto, Electrical Engi¬ 
neering Dept., Toronto, Ont. M5S 1A4, Can¬ 
ada, phone (416) 978-5033. 


June 1991 


ICSE 13, 13th Int’l Conf. on Software 
Engineering, May 13-17, Austin, 
Texas. Cosponsor: ACM. Contact ICSE 13, 
Bryan Fugate, MCC, 3500 W. Balcones Center 
Dr., Austin, TX 78759-6509, phone (512) 338- 
3330; MCC, PO Box 200015, Austin, TX 
78720-0015; or ICSE 13, IEEE Computer So¬ 
ciety, 1730 Massachusetts Ave. NW, Wash¬ 
ington, DC 20036-1903, phone (202) 371- 
1013. 

CompEuro 91, IEEE Int’l Conf. on 
Advanced Computer Technology, Reliable 
Systems, and Applications, May 13-17, 
Bologna, Italy. Cosponsors: IEEE Region 8 
et al. Contact Vito Monaco, Dip. Eletronica In- 
formatica E Sistemistica, Univ. Di Bologna, 
Viale Risorgimento, 1-60136, Bologna, Italy. 

CCW 91, Third IEEE Conf. on Computer 
Workstations, May 15-17, Falmouth, 
Mass. Sponsor: IEEE Computer Society 
Technical Committee on Operating Sys- 


Fourth Int’l Conf. on Industrial and 
Engineering Applications of Artificial 
Intelligence and Expert Systems, June 2-5, 
Kauai, Hawaii. Sponsors: ACM et al. Contact 
Moonis Ali, Univ. of Tennessee Space Inst., 
MS15, B.H. Goethert Pkwy., Tullahoma, TN 
37388-8897, phone (615) 455-0631, ext. 236, 
fax (615) 454-2354, e-mail alif@utsivl.bitnet. 

CVPR 91, IEEE Computer Society 
Conf. on Computer Vision and Pattern 
Recognition, June 3-7, Lahaina, Maui, Ha¬ 
waii. Contact Shahriar Negahdaripour, Elec¬ 
trical Engineering Dept., Univ. of Hawaii at 
Manoa, 2540 Dole St., Honolulu, HI 96822, 
e-mail shahriar@wiliki.eng.hawaii.edu. 

SCM 3, Third Int’l Software Configuration 
Management Workshop, June 
12-14, Trondheim, Norway. Cosponsors: 
ACM et al. Contact Reidar Conradi, Computer 
Systems and Telematics Div., Norwegian Inst. 
of Technology, N-7034 Trondheim, Norway, 
phone 47 (7) 593-444; or Peter Feiler, Software 
Engineering Inst., Carnegie Mellon Univ., 
Pittsburgh, PA 15213-3890, phone (412) 268- 
7790, e-mail phf@sei.cmu.edu. 

DAC 91, 28th ACM/IEEE Design 
Automation Conf., June 16-21, 
Orlando, Fla. Cosponsor: ACM. Contact Pat 
Pistilli, MP Associates, 7490 Clubhouse Rd., 
Suite 102, Boulder, CO 80301, phone (303) 
530-4333. 

10th Symp. on Computer Arithmetic, 
June 26-28, Grenoble, France. Cospon¬ 
sors: ACM et al. Contact Jean-Michel Muller, 
Lab. LIP-IMAC, Ens. Lyon, 69364 Lyon 
Cedex 07, France, phone 33 (72) 72-8229. 










IEEE COMPUTER SOCIETY 
Membership / Subscription Application 



BENEFITS 



Computer 

Computer comes automatically 
with membership. Written, 
reviewed, and refereed by 
experts, it features survey and 
tutorial articles covering the 
entire computer field, and 
departments such as new 
products, new product reviews, 
standards, and a reader forum 
called "The Open Channel." 
(monthly). 


Technical Committees 

Participate in one or more of our 33 technical 
committees — networks of professionals with common 
interests in specialty areas within computer hardware, 
software, and applications. 

Standards Working Groups 
Participate in the development of the more than 100 
standards projects currently sponsored by the society 
in such diverse areas as software engineering, local 
area networks, microprocessor buses, design automa¬ 
tion, programming languages, and standards 
definitions. 

Computer Society Press Books 

Receive discounts of up to 50% on over 600 titles 
covering a broad spectrum of computer science topics 
such as networking, communications, advanced 
systems, image processing, security, artificial 
intelligence, and design automation. Over 60 new titles 
are published annually. 

Conferences and Tutorials 
Choose from more than 100 conferences annually, 
ranging from large industry-oriented conferences 
replete with exhibits to small, highly interactive 
workshops. Members receive special low rates. 


Schedule of Fees 


To join: see item 1, 2, or 3. 

To subscribe: see item 4. 

Membership dues and periodical subscriptions are annualized to, and expire on, 
December 31. Choose full- or half-year rate schedules depending on date of 
receipt by the Computer Society as indicated below. 

                                                                Half Year      Full Year 
                                                                Mar 1-Aug 31   Sept 1-Feb 28 

1  I don’t belong to the IEEE and I want 
   to join just the Computer Society                            □ $23.50       □ $47.00 

2  I don’t belong to the IEEE and I want 
   to join both the Computer Society and the IEEE* 

   I reside in Region 1-6 (United States)                       □ $47.50       □ $95 
   I reside in Region 7 (Canada)                                □ $43.50       □ $87 
   I reside in Region 8 (Europe, Africa, or the Middle East)    □ $43.00       □ $86 
   I reside in Region 9 (Latin America)                         □ $39.50       □ $79.00 
   I reside in Region 10 (Asia and Pacific)                     □ $38.50       □ $77.00 


lh IEEE and the Computer Society may deduct $5 off th 


3  I already belong to the IEEE and I want 
   to join the Computer Society.                                □ $ 9.00       □ $18.00 

IEEE Member Number_ 

4  OPTIONAL PERIODICALS for new or current members 

                                                      Issues 
                                                      per year   Half Year   Full Year 
   IEEE Computer Graphics and Applications (3061)         6      □ $10.00    □ $20.00 
   IEEE Design and Test (3111)                            6      □ $10.50    □ $21.00 
   IEEE Expert (3151)                                     6      □ $ 9.00    □ $18.00 
   IEEE Micro (3071)                                      6      □ $ 9.50    □ $19.00 
   IEEE Software (3121)                                   6      □ $10.00    □ $20.00 
   Transactions on Computers (1161)                      12      □ $10.00    □ $20.00 
   Transactions on Knowledge and 
     Data Engineering (1471)                              4      □ $ 5.00    □ $10.00 
   Transactions on Parallel and 
     Distributed Systems (1501)                           4      □ $ 5.50    □ $11.00 
   Transactions on Pattern Analysis and 
     Machine Intelligence (1351)                         12      □ $10.00    □ $20.00 
   Transactions on Software Engineering (1171)           12      □ $10.00    □ $20.00 

Total amount remitted with this application $ 

□ Checks are accepted in Belgian, British, German, Swiss, Japanese, or 
U.S. currencies. U.S. checks must be drawn on a U.S. bank. 

□ Visa □ Master Card □ American Express □ Eurocard 

Charge Card Number                    Exp. Date (Mo/Yr) 


PRICES EXPIRE 12/31/90 


je governed by IEEE’s and the society’s c 


s, bylaws, and statements of 


MAILING ADDRESS 


Name of educational institution 

OCCUPATION __ 


REFERENCE (an IEEE member; if unknown, a managerial person who knows you professionally) 


City State/Country Zip City State/Country Zip 


Return to: IEEE Computer Society, 10662 Los Vaqueros Circle, P.O. Box 3014, Los Alamitos, CA 90720-1264 USA. 

Residents of Europe mail to: IEEE Computer Society, 13, Avenue de l’Aquilon, B-1200, Brussels, BELGIUM. 

Asian / Pacific residents mail to: IEEE Computer Society, Ooshima Building, 2-19-1 Minami-Aoyama, Minato-ku, Tokyo 107 JAPAN. 















































BOOK REVIEWS 


Editor: Guy Johnson, Department of Information Technology, Rochester Institute of Technology, 1 Lomb Memorial Drive, Rochester, NY 14623. 


Neural Computing: Theory and Practice 

Philip D. Wasserman (Van Nostrand Reinhold, New York, 1989, 230 pp., $36.95) 


I first heard about this book during 
IEEE’s 1989 video conference on neural 
networks, where one of the panelists 
highly recommended it. The book’s 10 
chapters span the most popular and well- 
known topics and paradigms in neural 
networks, including the fundamentals of 
artificial neural networks, perceptrons, 
back propagation, counterpropagation 
networks, statistical methods, Hopfield 
nets, bidirectional associative memories, 
adaptive resonance theory, optical neural 
networks, the cognitron, and the neocognitron. 
Three appendixes cover such topics as 
biological neural networks, vector and 
matrix operations, and training algo¬ 
rithms. 

The book is well suited to both novice 
readers and those with a fair background 
in neural networks. I also recommend it 
as a text for an undergraduate course 
covering the basic topics and algorithms 
in neural networks, although it would 
need to be augmented with exercises and 
problem sets. 

One good point about this book is that 
it does not have to be read from cover to 
cover. Each chapter is self-contained, 
making this an excellent reference work. 
The book covers each algorithm, para¬ 
digm, or topic in plain text followed by 
mathematical formulas and algorithms, 
making the material much simpler to un¬ 
derstand. 

Interested readers with a background 
in first-year calculus should find the book 
reasonable and helpful for self-instruc¬ 
tion. The book has enough detail that a 
reader who has taken a one-semester 
computer programming course should be 
able to implement the paradigms easily 
on a general-purpose computer. This 
book is not intended to advance the state 
of the art, and it is not very appropriate 
for experienced neural-network researchers, 
although they might find the reference 
lists in each chapter helpful. 

The book’s style, organization, and 
continuity are all very good. Its strong 
point is that it cuts through the jargon 
and presents concepts in a format that’s 
easy to understand. Its weak point is that 
it does not offer enough information on 
paradigm applications and the kinds of 
problem domains best suited to specific 
paradigms. The book is intended for nov¬ 
ices and application-oriented people, 
who are usually most interested in apply¬ 
ing the technology and finding good ap¬ 
plications. Therefore, a section on how to 
identify suitable applications and the 
characteristics of such problems would 
have been extremely useful. 

Mina Akhavi 

Commercial Programming Systems 


Computer Modeling for Discrete Simulation 

Michael Pidd, ed. (John Wiley and Sons, New York, 1989, 274 pp., $54.95) 


Some book reviews are simple; you 
only have to write a single word: “good,” 
“excellent,” “bad,” etc. Other times, 
things are more complicated, and more 
words are required: “very good” or “quite 
bad.” I tend to use my own terminology 
for book reviews. Computer Modeling for 
Discrete Simulation is a “tasteful” book. 

Pidd and the other authors are all sci¬ 
entists and academic teachers in manage¬ 
ment science, gathered here to investi¬ 
gate the relationship between computer 
simulation and management science. 
Everyone with a general, elementary 
background in simulation — computer 
scientists, production engineers, software 
designers, physicists, mathematicians, 
system analysts, etc. — could read this 
book and find it useful. Even readers 
with no real ideas about simulation will 
get some. 

Unfortunately, the book is neither a 
course text nor a manual for self-instruction. 
It is a source book with a great list 
of references and cross references on 
computer modeling, simulation, and 
management methods. Although most of 
the references are quite stimulating for 
further study and experimentation, the 
book can be read and understood without 
reaching for them. 

The book covers major topics in mod¬ 
eling, simulation, and management, such 
as simulation implementation, proper 
languages, supporting facilities, man- 
machine interfaces, artificial intelligence, 
and expert systems. 

Pidd’s team saves the best for last — a 
brilliant debate between Pidd and J.G. 
Crooks. Crooks claims C is better than 
Pascal for simulation, while Pidd dis¬ 
agrees. (I think Crooks has a point, but 
that’s my own opinion; I hate Pascal for 
various reasons.) 


The book does not consider desktop 
applications, but they might be beyond 
its scope, since the tone is more like that 
of a seminar book or the proceedings of a 
high-quality conference. 

If you are interested in computer mod¬ 
eling, you will be interested in this book. 
I found the writing very smooth and easy 
to read, especially considering that Pidd 
had to make five authors follow a com¬ 
mon style of expression. The printing 
and quality are excellent, and there is a 
full index and appendixes containing 
many examples. You can also buy an 
MS-DOS floppy disk with C and Pascal 
source code for two of the examples 
(you’ll have to supply the compilers 
yourself). 


Leonidas J. Irakliotis 
Miami University, Oxford, Ohio 










DB2 SQL: A Professional Programmer’s Guide 

Tim Martyn and Tim Hartley (Intertext, New York, 1989, 433 pp., $39.95) 


DB2 SQL: A Professional Program¬ 
mer’s Guide is misnamed for two rea¬ 
sons. First, it is an excellent introduction 
to Structured Query Language for end 
users — computer-literate, non-informa¬ 
tion-systems professionals who do much 
of their own on-line data analysis — as 
well as for professional programmers and 
analysts. Second, it covers SQL/DS as 
well as DB2 SQL, and it does a good job 
of highlighting differences between them. 

The book focuses on the Select state¬ 
ment, which is the SQL statement most 
used by end-users and most often execut¬ 
ed interactively through SPUFI, QMF, 
ISQL, and SQLDBSU for on-line data 
extraction, analysis, and report writing. It 
is also the area where DB2 SQL and 
SQL/DS are most similar. 

The Select statement might seem a 
narrow topic for a text, but the list of Se¬ 
lect’s features and uses is long. The book 
discusses built-in functions (scalar and 
column), sorting, Boolean connectors, 
pattern matching, arithmetic expressions, 
date and time data, null values, data type 
conversion, join and union operations, 
subqueries and correlated subqueries, 
logical versus “as-they-appear” operation 
sequences, and views. 

Other topics are covered to the extent 
they affect aspects of Select statements. 
For example, a section on Create Table 
concentrates on detailed discussion of 
SQL data types. A good understanding of 
data types is necessary to successfully 
use a number of functions and join oper¬ 
ations. Other topics include entity and 
referential integrity, indexes, data manip¬ 
ulation, and database security. The au¬ 
thors are careful to specify that the non- 
Select statements are generally best 
executed in a carefully controlled envi¬ 
ronment that uses embedded SQL. 

The book’s structure makes it easy to 
read straight through. Most new material 
gets a brief introduction, an example, 
numbered comments that expand on the 
new material, and exercises (with an¬ 
swers in an appendix). Comments men¬ 
tion common errors and misconceptions, 
specify system responses, and expand on 
the syntax. Most sections of new material 
are only two or three pages long and be¬ 
gin on a new page, making it easy to fol¬ 
low and reference. 

The structure also makes this an excel¬ 
lent text for professional training ses¬ 
sions, and the short topics and numbered 
comments make it easy to use in lectures. 
Indeed, most of the book is a compilation 
of handouts developed by the authors for 
workshops and seminars (one example 
still contains a workshop reference). 
This is well-tested, well-structured, 
and well-produced training material. 

Academics might also be interested in 
this volume, though they are likely to 
find the focus too narrow. The book does 
not offer a general and comprehensive 
presentation of SQL; the discussions of 
Update, Insert, and Delete are short; 
there is little discussion of locking and 
associated transaction-oriented consider¬ 
ations; and there is no attempt to present 
detailed system or database design issues 
or principles. The book’s purpose is to 
describe how to effectively use selected 
features of specific commercial products 
(DB2 SQL and SQL/DS). 

The book’s highlighting of the differences 
between DB2 SQL and SQL/DS 
is helpful for those programmers and 
end users who must deal with both prod¬ 
ucts or who are moving from one to the 
other. However, the book does not dis¬ 
cuss differences in those areas that would 
be of most interest to database adminis¬ 
trators and systems personnel, the levels 
where DB2 SQL and SQL/DS are most 
dissimilar. 


Appendixes provide succinct introductions 
to SPUFI, QMF, and ISQL. SQLDBSU, 
roughly the SQL/DS version of 
SPUFI, is not covered. 

The book refers to performance considerations 
only obliquely. An explicit 
warning concerning processing time for 
views would be helpful to end users, who 
are often unaware that processing time 
for a Select statement run against a view 
includes any processing time required for 
generating the view’s data. 

Also, in a professional environment, 
Creators are used from day one, yet the 
book reserves its discussion for the next-to-last 
chapter (the primary topic of 
which is database security). This discussion 
could easily (and should) be moved 
up to the chapter on synonyms. 

These are minor complaints, though. 
The book is very readable and contains 
everything (except installation-specific 
details) needed to learn a broad and powerful 
range of Select statement features 
in an interactive environment. 

Dan Wahl 

Vector Research 


Upgrading and Repairing PCs 

Scott Mueller (Que Books, Indianapolis, Ind., 1989, 724 pp., $27.95) 


I’ve been using Upgrading and Repairing 
PCs for almost a year now, and 
it’s become my PC bible. I can’t think of 
a problem I’ve faced that isn’t discussed 
in this book. 

If you’ve got a sticky problem, such as 
whether your expensive monitor will 
match up with that economical display 
adapter, this book will answer it for you. 
If your computer needs repair, this book 
will take you from the top of your problem 
to the bottom. If you need help 
choosing a hard drive, you’ll find all the 
pros and cons here. 

Consider the book’s treatment of hard 
drives. After giving background on hard 
drives, the author lists and explains each 
component and presents a clear, complete 
illustration pointing out where all 
the components are located. Hard-drive 
terminology is discussed in enough depth 
for you to interpret everything on the 
manufacturer’s spec sheet. The author 
also discusses little-known repairs, such 
as replacing the hard drive’s logic board. 
The book lists the various features of 
each type of hard drive and lists parameters 
for each model. The discussion includes 
controller data, interleave factors, 
transfer rates, configuration, formatting, 
partitioning, and repair information. 

The book’s real asset is its detailed explanation 
of each piece of hardware. But 
it isn’t enough to know how the parts 
work; each system’s options and the 
components with which it is compatible 
can only be understood fully from the 
perspective of the PC’s evolution. The 
author addresses this by including a 
chapter on the history of PCs. 

Upgrading and Repairing PCs also 
discusses PC architecture and system 
memory, and the 70-page appendix is 
full of practical reference information, 
such as DOS error codes, I/O addresses, 
interrupts, and memory maps. 

The $27.95 price might seem a little 
steep, but it’s justified considering that 
the book’s 700 pages give you a complete 
source of PC information and could help 
you avoid costly repairs. You can save 
hours of shopping for upgrades by reading 
this book before you step into a store. 

D.C. Bissell 

Harvest Lane Associates 







Object-Oriented Analysis 

Peter Coad/Edward Yourdon (Prentice Hall, Englewood Cliffs, N.J., 1990, 232 pp., $32.50) 


The literature on object-oriented ap¬ 
proaches contains a lot of material on 
object-oriented programming and object- 
oriented design but very little on object- 
oriented analysis. This book sets out to 
fill that gap, drawing on the authors’ 
extensive experience. 

During software development, dispar¬ 
ate notations and strategies for different 
process and data models have kept the 
two models separated. Because of this 
situation, Coad and Yourdon began re¬ 
searching and developing method nota¬ 
tions and strategies, which have been put 
into practice and refined into a systemat¬ 
ic method. 

The book first highlights the challenge 
of system analysis and discusses some of 
its complexity. It then outlines the four 
dominant methods — functional decom¬ 
position, dataflow, information model¬ 
ing, and the object-oriented approach — 
and explores a fully object-oriented pro¬ 
gramming language and environment 
(Smalltalk). 

The object-oriented analysis method is 
introduced in five steps: identifying objects, 
identifying structures, defining attributes, 
defining connections, and defining 
services. The book describes computer-aided 
software engineering support for 
object-oriented analysis. It also describes 
considerations for object-oriented design, 
as well as what happens to the object- 
oriented analysis model as design pro¬ 
ceeds. Finally, the book describes how to 
apply object-oriented analysis under 
DoD-Std-2167A. 

The book’s primary audience is prac¬ 
ticing systems analysts, designers, and 
programmers, although it could also be 
used successfully by some managers, 
testers, and end users. The authors assume 
readers have a fundamental understand¬ 
ing of structured system analysis, design, 
and implementation, and experience with 
such analysis tools as dataflow diagrams 
and entity-relationship diagrams. 

My own interest is in object-oriented 
techniques, and I think object-oriented 
analysis is a critical basis for object- 
oriented design and programming. As 
such, this book is practical and worthy of 
attention. The methods should help readers 
decrease the time needed to analyze 
system requirements in a software devel¬ 
opment project. 

The book is easy to understand and 
fairly self-contained, making it a good 
self-instruction book. Unfortunately, it is 
not suitable for classroom use because it 
provides neither practical exercises nor 
the necessary theoretical depth for an ad¬ 
vanced course. This is a “picture book” 
— providing a simple view of the tech¬ 
nique — much like Object-Oriented Sys¬ 
tems Analysis: Modeling the World in 
Data (Prentice Hall, 1988) by Sally 
Shlaer and Stephen J. Mellor. Coad and 
Yourdon’s book is better, although it 
also leaves too many questions unan¬ 
swered. 

Although Coad and Yourdon do not provide a formal or theoretical methodology for object-oriented analysis, I recommend their book to anyone involved in system analysis or the application of object-oriented techniques.

Zheng Xiaojun 

Beijing Institute of System Engineering 


Dependability of Resilient Computers 

Thomas Anderson (BSP Professional Books, 1989, 261 pp., $58.50) 


This book, the second in a series on resilient computing systems, provides a succinct introduction to the field and would be an excellent reference for the researcher.

Each of the book’s 11 chapters is written by a recognized expert in a specific aspect of the field. Overall, the book can be divided into three parts. The first three chapters adopt a system view of the dependability concept. Chapters 4-8 then present some specific topics of resilient computing. Finally, Chapters 9-11 describe some of the commercial systems that are marketed based on their resilience properties.

The book is well written, with sufficient references at the end of each chapter. Most chapters in the first and second parts are problem introductions and survey papers. Although most of the chapters are based on earlier papers by different authors, they are highly readable without being superficial. Chapter 1 deserves particular attention. In it, Jean-Claude Laprie provides a coherent conceptual framework and terminology for dependability. He casts a new light on the concept by including reliability, safety, and security as the three basic attributes of dependability. The only minor disappointment in this chapter is that some methods (such as fault-forecasting methods) are not given the same detailed attention as others (such as fault-tolerance methods).

Chapter 2 discusses the importance of good structure in system design, using material based mostly on a previous journal paper. Chapter 3 considers dependability evaluation methods, providing an excellent survey of methods and tools as well as an extensive reference list.

Chapter 4 is somewhat more theoretical than the others, providing a formal approach to exception handling through a series of rigorous mathematical definitions and notations (an approach I favor). Chapter 5 gives an excellent survey of resilience in distributed systems and addresses various problems by using the atomic action model. Chapter 6 addresses problems related to interactive consistency, or how to maintain the consistency of information stored in a distributed system, including such problems as clock synchronization, atomic broadcast, and membership. Chapter 7 shows some interesting results from experiments with one of the fault-tolerant software techniques. Chapter 8 summarizes different fault-tolerance methods for concurrent software and includes a list of open problems.

Chapters 9-11 discuss some of the most 
common and popular resilient computing 
systems. 

A minor flaw in this book is the inconsistency of some basic concepts in chapters by different authors. For example, Chapters 5 and 11 contain two different classifications of faults. Although these chapters classify faults from two different angles, and the two classifications are complementary, it would be better to use one consistent classification throughout the book. Also, some material in the book overlaps the previous book in the series. For example, the discussion of Stratus, a commercial resilient computing system, appears as a chapter in each book.

Jie Wu 

Florida Atlantic University 








COMPUTER SOCIETY MAGAZINES


July 1990 IEEE Computer Graphics & Applications

Tutorial: Techniques for Cubic Algebraic Surfaces, Thomas W. Sederberg

Feature Articles

Radiosity Redistribution for Dynamic Environments, David W. George, Francis X. Sillion, and Donald P. Greenberg

Visibility Determination on Projected Grid Surfaces, Sue-Ling Chen Wang and John Staudhammer

Renderman: Pursuing the Future of Graphics, Anthony A. Apodaca and M.W. Mantle

A Visualization Programming Environment for Multicomputers, Gary Bishop, Mark Monger, and Paul Ramsey

A Dataflow Toolkit for Visualization, D. Scott Dyer

Portability of Interactive Graphics Software, Donald L. Brittain

NetCDF: An Interface for Scientific Data Access, Russ Rew and Glenn Davis


July 1990 IEEE Software 

A Graphical Specification System for User-Interface Design, Andrew Harbert, William Lively, and Sallie Sheppard

Drawing Dynamic Trees, Sven Moen

IFS: A Tool to Build Application Systems, Kiem-Phong Vo

Shared-Memory Parallel Processing in C++, Bob Beck

Generating Test Data with Enhanced Context-Free Grammars, Peter M. Maurer

Connecting Tools Using Message Passing in the Field Environment, Steven P. Reiss

Algres: An Advanced Database System for Complex Applications, Stefano Ceri, Stefano Crespi-Reghizzi, Roberto Zicari, Gianfranco Lamperti, and Luigi A. Lavazza


For subscription information, circle number 200 on the reader service card.


Magazine order form 

Please print or use peel-off label 

Quantity Price Total

CG&A July '90

Total (payment enclosed)

Mail to: IEEE Computer Society Order Dept., 10662 Los Vaqueros Circle, PO Box 3014, Los Alamitos, CA 90720-1264



SCIENTIFIC COMPUTING

Managing Editor 

Peter Deuflhard 

IMPACT of Computing in Science and Engineering 

This interdisciplinary journal focuses on articles from the areas of mathematical and scientific 
modeling, scientific computing, computer science, and scientific and engineering applications. 

A Representative Selection of Articles 

Ch. Lubich 

h²-Extrapolation Methods for Differential-Algebraic Systems of Index 2

William D. Gropp and David E. Keyes 

Domain Decomposition on Parallel Computers 

J. Molenaar and P. W. Hemker 

A Multigrid Approach for the Solution of the 2D Semiconductor Equations 

Volume 2 (1990), 4 issues ISSN 0899-8248 In the U.S.A. and Canada: $80.00 All other countries: $92.00 


Computer Vision, Graphics, and Image Processing 

Expanding into two publications in 1991: 

Editors-in-Chief 

Norman Badler 
Rama Chellappa 

CVGIP: Graphical Models and Image Processing 

Focusing on synthesis methods and computational models underlying computer-generated or 
-processed imagery 

Editor-in-Chief 

Linda G. Shapiro 

CVGIP: Image Understanding 

Focusing on the computer analysis of pictorial information 

Please contact the Publisher for more information. 

S0269 

To receive a sample copy, privileged personal rates, or further information, please write or call: 

ACADEMIC PRESS, INC., Journal Promotion Department 

1250 Sixth Avenue, San Diego, CA 92101, U.S.A. 

(619) 699-6742 

All prices are in U.S. dollars and are subject to change without notice.


Reader Service Number 8





















Executive Committee 

President: Helen M. Wood* 

National Oceanic and Atmospheric Administration 
FB 4, Rm. 1069, Code E/SP 
Washington, DC 20233 
(301) 763-1564

President-Elect: Duncan H. Lawrie* 

Past President: Kenneth R. Anderson* 

VP, Conferences and Tutorials: Laurel V. Kaleda (1st VP)* 
VP, Standards: Paul L. Borrill (2nd VP)* 

VP, Area Activities: Gerald L. Engel†
VP, Education: Ronald G. Hoelzeman†
VP, Membership and Information: Barry W. Johnson†
VP, Press Activities: James H. Aylor†
VP, Publications: Sallie V. Sheppard* 

VP, Technical Activities: Mario R. Barbacci* 

Secretary: David Pessel* 

Treasurer: Joseph Boykin†
Division V Director: Edward A. Parrish, Jr.†
Division VIII Director: J. T. Cain†
Executive Director: T. Michael Elliott†


Board of Governors 

Term Expiring 1990: 

Vishwani Agrawal, Mario R. Barbacci, 

Ming T. (Mike) Liu, Yale N. Patt, Donald E. Thomas, 
Benjamin W. Wah, Ronald Waxman 
Term Expiring 1991: 

P. Bruce Berra, Michael Evangelist, 

Ted Lewis, Raymond E. Miller, Earl E. Swartzlander, Jr., 
Joseph E. Urban, Thomas W. Williams 
Term Expiring 1992: 

Alicja I. Ellis, Tadao Ichikawa, 

David Pessel, Sallie V. Sheppard, Bruce D. Shriver, 
Harold Stone, Wing N. Toy 

Next Board Meeting 

November 16, 1990, 8:30 a.m.

New York Hilton, New York, NY 

Senior Staff 

Executive Director: T. Michael Elliott 
Editor and Publisher: H. True Seaborn 
Director, Computer Society Press: Eugene M. Falken 
Director, Conferences and Tutorials: Anne Marie Kelly 
Director, Finance and Administration: Tod S. Heisler 
Director, Board and Administrative Services: Violet S. Doan 

Computer Society Offices 

Headquarters Office 

1730 Massachusetts Ave. NW 
Washington, DC 20036-1903 
Phone (202) 371-0101 
Fax: (202) 728-9614

Publications Office 

10662 Los Vaqueros Cir. 

PO Box 3014 

Los Alamitos, CA 90720-1264
Membership and General Information: (714) 821-8380
Publication Orders: (800) 272-6657
Fax: (714) 821-4010
European Office 
13, Ave. de L’Aquilon 
B-1200 Brussels, Belgium 
Phone: 32 (2) 770-21-98 
Fax: 32 (2) 770-85-05 
Asian Office 
Ooshima Building 
2-19-1 Minami-Aoyama, Minato-ku 
Tokyo 107, Japan 
Phone: 81 (3) 408-3118
Fax: 81 (3) 408-3553 


Use the Reader Service Card to obtain information on: 

• Membership application—student #203, others #202 

• Periodicals subscription form for individuals #200

• Periodicals subscription form for organizations #199 

• Publications catalog #201 

• Compmail II electronic mail brochure #194 

• Technical committee list/application #197 

• Chapters lists, start-up procedures—student/regular #193 

• Student scholarship information #192 

• Volunteer leaders/staff directory #196 

• IEEE senior member grade application #204 (requires ten 
years practice and significant performance in five of those 
ten.) 

To check membership status or report a change of address, 
call the IEEE toll-free number, 1-800-678-4333. Direct all other 
Computer Society related questions to the Publications Office. 

Purpose 

The IEEE Computer Society advances the theory and practice of 
computer science and engineering, promotes the exchange of 
technical information among 100,000 members worldwide, and 
provides a wide range of services to members and nonmembers. 

Membership 

Members receive the acclaimed monthly magazine Computer,
discounts, and opportunities to serve (all activities are led by 
volunteer members). Membership is open to all IEEE members, 
affiliate society members, and others seriously interested in the 
computer field. 

Publications and Activities 

Computer. An authoritative, easy-to-read magazine containing 
tutorial and in-depth articles on topics across the computer field, 
plus news, conferences, calendar, interviews, and new products. 

Periodicals. The society publishes six magazines and five 
research transactions. Refer to membership application or request 
information as noted above. 

Conference Proceedings, Tutorial Texts, Standard 
Documents. The Computer Society Press publishes more than 
100 titles every year. 

Standards Working Groups. Over 100 of these groups produce 
IEEE standards used throughout the industrial world. 

Technical Committees. More than 30 TCs publish newsletters, 
provide interaction with peers in specialty areas, and directly 
influence standards, conferences, and education. 

Conferences/Education. The society holds about 100 
conferences each year and sponsors many educational activities, 
including computing science accreditation. 

Chapters. Regular and student chapters worldwide provide the 
opportunity to interact with colleagues, hear technical experts, and 
serve the local professional community. 

European Office 

Payments for Computer Society membership and publication 
orders are accepted by checks in Belgian, British, German, Swiss, 
or US currency. Checks in US funds must be drawn on a US bank. 
Payment may also be made by American Express, MasterCard, or 
Visa credit cards. 

Asian Office 

Payments for Computer Society membership and publication 
orders are accepted by checks in Japanese or US currency. 

Checks in US funds must be drawn on a US bank. Payment may 
also be made by electronic fund transfer to the Bank of Tokyo, 
Akasaka Branch, Toza acct. 0767956; the credit receiver is the 
IEEE Computer Society Headquarters Office. Payment may also be 
made by American Express, MasterCard, or Visa credit cards. 

Ombudsman 

Members experiencing problems — magazine delivery, 
membership status, or unresolved complaints — may write to the 
ombudsman at the Publications Office. 


IEEE COMPUTER SOCIETY 

A member society of the Institute of Electrical and Electronics Engineers, Inc. 










CALL FOR PAPERS 



18th International Symposium on 


Computer Architecture 

Toronto, Canada 
May 27-30, 1991 


sponsored by 

IEEE Computer Society

Institute of Electrical and Electronics Engineers (IEEE)
Association for Computing Machinery (ACM)

SYMPOSIUM COMMITTEE 



Submitted papers will be accepted for consideration
until November 21, 1990. Five copies of the double-spaced
manuscript, in English, not exceeding 6000
words in length, should be sent to the Program Chair.
A single cover sheet should be included that contains
the paper title and the authors' full names, affiliations,
complete addresses, and phone numbers. E-mail
addresses should be included if available.


Conference Secretary: Tutorial Chair: 



416-925-7231 Electrical Engineering 


University of Toronto 
416-978-5033 


Because the identity of the authors will not be revealed 
to the referees, authors’ names and affiliations should 
appear only on the cover sheet. Authors should avoid 
references and citations that compromise anonymity. 



[...] on any aspects of Computer Architecture [...] include, but are not limited to,
[...] architectures
[...] techniques
[...] parallel architectures
[...] operating systems support
[...] architectures
[...]tion and measurement
[...] on architecture

[...] architectures


Notification of acceptance will be given by February 
8, 1991. Authors of accepted papers will be requested 
to submit a final, camera-ready copy by March 11, 1991. 

Tutorials will be held on May 27, 1991. Send five
copies of proposals for full- or half-day tutorials
to the Tutorials Chair, to be received by November
28, 1990. Proposals should include the tutorial
title, an outline, a brief description of the topics to be
covered, the intended audience, the assumed attendee
background, and a resume of the speaker.


Very-high 
















CALL FOR PAPERS 

The Fifteenth Annual International 

Computer Software and 
Applications Conference 

COMPSAC 91



TUTORIALS: September 9-10, 1991 • CONFERENCE: September 11-13, 1991


Co-sponsored by 






INFORMATION PROCESSING 
SOCIETY OF JAPAN 


Conference Co-chairmen 

Yutaka Ohno 

Advanced Technology and Mechatronics 
Research Institute of Kyoto (ASTEM RI) 
Kyoto, Japan 

Dick B. Simmons 
Texas A&M University 
College Station, Texas, USA 

COMPSAC Steering Committee Chairman 
Stephen S. Yau 
University of Florida 
Gainesville, Florida, USA 


Program Co-chairmen 

Papers from Asia, Australia and 
New Zealand send to: 

Motoei Azuma 
Waseda University 

c/o Business Center for Academic Societies of Japan
3-23-1 Hongo, Bunkyo-ku, Tokyo 113, Japan 
Telephone: (81) 3-817-5831 
Telefax: (81) 3-817-5836 


Papers from North and South Americas, 
Europe and Africa send to: 

Lionel M. Ni 

Michigan State University 
Department of Computer Science 
A714 Wells Hall 

East Lansing, Michigan 48824-1027, USA 
Telephone: (517) 353-4386 
Telefax: (517) 336-1061 
Internet: ni@cps.msu.edu 


IMPORTANT DEADLINES: 

■ January 12, 1991 all papers and panel proposals received

■ February 12, 1991 panel organizers notified of acceptance

■ March 1, 1991 organizers of accepted panel proposals to provide
final information on session chairperson and panelists

■ March 17, 1991 authors notified of acceptance

■ May 15, 1991 camera-ready copies of accepted papers and
panelists' position papers due


INFORMATION FOR AUTHORS: 

■ 6 copies of the full paper (single or double sided), 3000-5000 words

■ Papers should include: title, authors, affiliations and 150-word abstract

■ Identify the author responsible for correspondence, including 
title, affiliation, address, telephone number and fax number 

INFORMATION FOR PANEL ORGANIZERS: 

■ 6 copies of panel proposal 

■ Proposals should include: title, organizer's affiliation, address, 
telephone number, fax number, a 150-word scope statement and 
proposed panelists. 


Papers and Panel Session Proposals related, but not limited to, the following areas are invited: 


Software Tools for Parallel and Distributed Systems 

■ Programming Languages and Compilers 

■ Programming Environments 

■ Modeling and Analysis Tools 

■ Visualization Tools 

■ Debugging Tools 

■ Validation and Verification Tools 

■ Reliable and Real-Time Software Systems 

Software Methodologies, CASE Tools, and Management 

■ Requirements and Specifications 

■ Metrics and Measurements

■ Quality Assurance and Control 

■ Maintenance and Reusability 

■ Project and Configuration Management 

■ Modeling and Fast Prototyping 

■ Testing and Verification 


Object-Oriented Software Developments 

■ Development Environment 

■ Programming Languages 

■ Visual Languages 

■ Database Management 

■ Human Computer Interface 

■ Practical Applications 


Knowledge and Database Based Systems 

■ Integrated Knowledge-Based Systems 

■ Practical Expert Systems

■ Application of Neural Networks 

■ Multimedia Database Systems 

■ Software Methodologies for Expert Systems

■ Computer Security 





















